NUCLEIC ACID SEQUENCES TO PROTEINS INVOLVED IN TOCOPHEROL SYNTHESIS

INTRODUCTION 



This application claims the benefit of the filing date of the provisional Application U.S.
Serial Number 60/129,899, filed April 15, 1999, and the provisional Application U.S. Serial
Number 60/146,461, filed July 30, 1999.

TECHNICAL FIELD

The present invention is directed to nucleic acid and amino acid sequences and
constructs, and methods related thereto.



BACKGROUND

Isoprenoids are ubiquitous compounds found in all living organisms. Plants synthesize a
diverse array of greater than 22,000 isoprenoids (Connolly and Hill (1992) Dictionary of
Terpenoids, Chapman and Hall, New York, NY). In plants, isoprenoids play essential roles in
particular cell functions, such as the production of sterols, which contribute to eukaryotic
membrane architecture; acyclic polyprenoids found in the side chains of ubiquinone and
plastoquinone; growth regulators such as abscisic acid, gibberellins, and brassinosteroids; and
the photosynthetic pigments chlorophylls and carotenoids. Although the physiological role of
other plant isoprenoids, such as the vast array of secondary metabolites, is less evident, some
are known to play key roles in mediating adaptive responses to different environmental
challenges. In spite of the remarkable diversity of structure and function, all isoprenoids
originate from a single metabolic precursor, isopentenyl diphosphate (IPP) (Wright, (1961)
Annu. Rev. Biochem. 20:525-548; and Spurgeon and Porter, (1981) in Biosynthesis of Isoprenoid
Compounds, Porter and Spurgeon eds. (John Wiley, New York) Vol. 1, pp. 1-46).
A number of unique and interconnected biochemical pathways derived from the
isoprenoid pathway leading to secondary metabolites, including tocopherols, exist in chloroplasts
of higher plants. Tocopherols not only perform vital functions in plants, but are also important
from mammalian nutritional perspectives. In plastids, tocopherols account for up to 40% of the
total quinone pool.

Tocopherols and tocotrienols (unsaturated tocopherol derivatives) are well known
antioxidants. They play an important role in protecting cells from free radical damage and in the
prevention of many diseases, including cardiac disease, cancer, cataracts, retinopathy,
Alzheimer's disease, and neurodegeneration, and have been shown to have beneficial effects on
symptoms of arthritis and in anti-aging. Vitamin E is used in chicken feed for improving the
shelf life, appearance, flavor, and oxidative stability of meat, and to transfer tocols from feed to
eggs. Vitamin E has been shown to be essential for normal reproduction, to improve overall
performance, and to enhance immunocompetence in livestock animals. Vitamin E supplement in
animal feed also imparts oxidative stability to milk products.

The demand for natural tocopherols as supplements has been steadily growing at a rate of
10-20% for the past three years. At present, the demand exceeds the supply for natural
tocopherols, which are known to be more biopotent than racemic mixtures of synthetically
produced tocopherols. Naturally occurring tocopherols are all d-stereomers, whereas synthetic
α-tocopherol is a mixture of eight d,l-α-tocopherol isomers, only one of which (12.5%) is
identical to the natural d-α-tocopherol. Natural d-α-tocopherol has the highest vitamin E
activity (1.49 IU/mg) when compared to other natural tocopherols or tocotrienols. The synthetic
α-tocopherol has a vitamin E activity of 1.1 IU/mg. In 1995, the worldwide market for raw refined
tocopherols was $1020 million; synthetic materials comprised 85-88% of the market, the
remaining 12-15% being natural materials. The best sources of natural tocopherols and
tocotrienols are vegetable oils and grain products. Currently, most of the natural Vitamin E is
produced from γ-tocopherol derived from soy oil processing, which is subsequently converted to
α-tocopherol by chemical modification (α-tocopherol exhibits the greatest biological activity).

Methods of enhancing the levels of tocopherols and tocotrienols in plants, especially levels
of the more desirable compounds that can be used directly, without chemical modification, would be
useful to the art as such molecules exhibit better functionality and bioavailability.
In addition, methods for the increased production of other isoprenoid derived compounds in
a host plant cell are desirable. Furthermore, methods for the production of particular isoprenoid
compounds in a host plant cell are also needed.



SUMMARY OF THE INVENTION 

The present invention is directed to prenyltransferase (PT), and in particular to PT
polynucleotides and polypeptides. The polynucleotides and polypeptides of the present invention
include those derived from prokaryotic and eukaryotic sources.

Thus, one aspect of the present invention relates to isolated polynucleotide sequences
encoding prenyltransferase proteins. In particular, isolated nucleic acid sequences encoding PT
proteins from bacterial and plant sources are provided.

Another aspect of the present invention relates to oligonucleotides which include partial
or complete PT encoding sequences.

It is also an aspect of the present invention to provide recombinant DNA constructs which
can be used for transcription or transcription and translation (expression) of prenyltransferase. In
particular, constructs are provided which are capable of transcription or transcription and
translation in host cells.

In another aspect of the present invention, methods are provided for production of
prenyltransferase in a host cell or progeny thereof. In particular, host cells are transformed or
transfected with a DNA construct which can be used for transcription or transcription and
translation of prenyltransferase. The recombinant cells which contain prenyltransferase are also
part of the present invention.

In a further aspect, the present invention relates to methods of using polynucleotide and
polypeptide sequences to modify the tocopherol content of host cells, particularly in host plant
cells. Plant cells having such a modified tocopherol content are also contemplated herein.

The modified plants, seeds and oils obtained by the expression of the prenyltransferases 
are also considered part of the invention. 

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 provides an amino acid sequence alignment between ATPT2, ATPT3, ATPT4,
ATPT8, and ATPT12 performed using ClustalW.
Figure 2 provides a schematic picture of the expression construct pCGN10800.
Figure 3 provides a schematic picture of the expression construct pCGN10801.
Figure 4 provides a schematic picture of the expression construct pCGN10803.
Figure 5 provides a schematic picture of the expression construct pCGN10806.
Figure 6 provides a schematic picture of the expression construct pCGN10807.
Figure 7 provides a schematic picture of the expression construct pCGN10808.
Figure 8 provides a schematic picture of the expression construct pCGN10809.
Figure 9 provides a schematic picture of the expression construct pCGN10810.
Figure 10 provides a schematic picture of the expression construct pCGN10811.
Figure 11 provides a schematic picture of the expression construct pCGN10812.
Figure 12 provides a schematic picture of the expression construct pCGN10813.
Figure 13 provides a schematic picture of the expression construct pCGN10814.
Figure 14 provides a schematic picture of the expression construct pCGN10815.
Figure 15 provides a schematic picture of the expression construct pCGN10816.
Figure 16 provides a schematic picture of the expression construct pCGN10817.
Figure 17 provides a schematic picture of the expression construct pCGN10819.
Figure 18 provides a schematic picture of the expression construct pCGN10824.
Figure 19 provides a schematic picture of the expression construct pCGN10825.
Figure 20 provides a schematic picture of the expression construct pCGN10826.
Figure 21 provides an amino acid sequence alignment using ClustalW between the
Synechocystis sequence knockouts.
Figure 22 provides the amino acid sequences of the ATPT2, ATPT3, ATPT4, ATPT8, and
ATPT12 proteins from Arabidopsis and the slr1736, slr0926, sll1899, slr0056, and
slr1518 amino acid sequences from Synechocystis.
Figure 23 provides the results of the enzymatic assay from preparations of wild type
Synechocystis strain 6803, and Synechocystis slr1736 knockout.
Figure 24 provides bar graphs of HPLC data obtained from seed extracts of transgenic
Arabidopsis containing pCGN10822, which provides for the expression of the ATPT2 sequence,
in the sense orientation, from the napin promoter. Provided are graphs for alpha, gamma, and
delta tocopherols, as well as total tocopherol, for 22 transformed lines and a
nontransformed (wildtype) control.

Figure 25 provides a bar graph of HPLC analysis of seed extracts from Arabidopsis plants
transformed with pCGN10803 (35S-ATPT2, in the antisense orientation), pCGN10802 (line
1625, napin ATPT2 in the sense orientation), pCGN10809 (line 1627, 35S-ATPT3 in the sense
orientation), a nontransformed (wt) control, and an empty vector transformed control.



DETAILED DESCRIPTION OF THE INVENTION

The present invention provides, inter alia, compositions and methods for altering (for
example, increasing and decreasing) the tocopherol levels and/or modulating their ratios in host
cells. In particular, the present invention provides polynucleotides, polypeptides, and methods of
use thereof for the modulation of tocopherol content in host plant cells.

The present invention provides polynucleotide and polypeptide sequences involved in the
prenylation of straight chain and aromatic compounds. Straight chain prenyl transferases, as used
herein, comprise sequences which encode proteins involved in the prenylation of straight chain
compounds, including, but not limited to, geranyl geranyl pyrophosphate and farnesyl
pyrophosphate. Aromatic prenyl transferases, as used herein, comprise sequences which encode
proteins involved in the prenylation of aromatic compounds, including, but not limited to,
menaquinone, ubiquinone, chlorophyll, and homogentisic acid. The prenyl transferase of the
present invention preferably prenylates homogentisic acid.

The biosynthesis of α-tocopherol in higher plants involves condensation of homogentisic
acid and phytylpyrophosphate to form 2-methyl-6-phytylbenzoquinol, which can, by cyclization and
subsequent methylations (Fiedler et al., 1982, Planta, 155: 511-515; Soll et al., 1980, Arch.
Biochem. Biophys. 204: 544-550; Marshall et al., 1985, Phytochem., 24: 1705-1711, all of which
are herein incorporated by reference in their entirety), form various tocopherols. The Arabidopsis
pds2 mutant, identified and characterized by Norris et al. (1995), is deficient in tocopherol and
plastoquinone-9 accumulation. Further genetic and biochemical analysis suggests that the protein
encoded by PDS2 may be responsible for the prenylation of homogentisic acid. This may be a
rate limiting step in tocopherol biosynthesis, and this gene has yet to be isolated. Thus, it is an
aspect of the present invention to provide polynucleotides and polypeptides involved in the
prenylation of homogentisic acid.

Isolated Polynucleotides, Proteins, and Polypeptides 

A first aspect of the present invention relates to isolated prenyltransferase 
polynucleotides. The polynucleotide sequences of the present invention include isolated 
polynucleotides that encode the polypeptides of the invention having a deduced amino acid 
sequence selected from the group of sequences set forth in the Sequence Listing and to other 
polynucleotide sequences closely related to such sequences and variants thereof. 

The invention provides a polynucleotide sequence identical over its entire length to each 
coding sequence as set forth in the Sequence Listing. The invention also provides the coding 
sequence for the mature polypeptide or a fragment thereof, as well as the coding sequence for the 
mature polypeptide or a fragment thereof in a reading frame with other coding sequences, such as 
those encoding a leader or secretory sequence, a pre-, pro-, or prepro- protein sequence. The
polynucleotide can also include non-coding sequences, including for example, but not limited to, 
non-coding 5' and 3' sequences, such as the transcribed, untranslated sequences, termination 
signals, ribosome binding sites, sequences that stabilize mRNA, introns, polyadenylation signals, 
and additional coding sequence that encodes additional amino acids. For example, a marker 
sequence can be included to facilitate the purification of the fused polypeptide. Polynucleotides 
of the present invention also include polynucleotides comprising a structural gene and the 
naturally associated sequences that control gene expression. 

The invention also includes polynucleotides of the formula:
X-(R1)n-(R2)-(R3)n-Y

wherein, at the 5' end, X is hydrogen, and at the 3' end, Y is hydrogen or a metal, R1 and R3 are
any nucleic acid residue, n is an integer between 1 and 3000, preferably between 1 and 1000, and
R2 is a nucleic acid sequence of the invention, particularly a nucleic acid sequence selected from
the group set forth in the Sequence Listing and preferably those of SEQ ID NOs: 1, 3, 5, 7, 8, 10,
11, 13-16, 18, 23, 29, 36, and 38. In the formula, R2 is oriented so that its 5' end residue is at the
left, bound to R1, and its 3' end residue is at the right, bound to R3. Any stretch of nucleic acid
residues denoted by either R group, where n is greater than 1, may be either a heteropolymer or a
homopolymer, preferably a heteropolymer.

The invention also relates to variants of the polynucleotides described herein that encode
for variants of the polypeptides of the invention. Variants that are fragments of the
polynucleotides of the invention can be used to synthesize full-length polynucleotides of the
invention. Preferred embodiments are polynucleotides encoding polypeptide variants wherein 5
to 10, 1 to 5, 1 to 3, 2, 1 or no amino acid residues of a polypeptide sequence of the invention are
substituted, added or deleted, in any combination. Particularly preferred are substitutions,
additions, and deletions that are silent such that they do not alter the properties or activities of the
polynucleotide or polypeptide.

Further preferred embodiments of the invention are polynucleotides that are at least 50%, 60%,
or 70% identical over their entire length to a polynucleotide encoding a polypeptide of the
invention, and polynucleotides that are complementary to such polynucleotides. More preferable are
polynucleotides that comprise a region that is at least 80% identical over its entire length to a
polynucleotide encoding a polypeptide of the invention and polynucleotides that are
complementary thereto. In this regard, polynucleotides at least 90% identical over their entire
length are particularly preferred, and those at least 95% identical are especially preferred. Further,
those with at least 97% identity are highly preferred and those with at least 98% and 99% identity
are particularly highly preferred, with those at least 99% being the most highly preferred.

Preferred embodiments are polynucleotides that encode polypeptides that retain
substantially the same biological function or activity as the mature polypeptides encoded by the
polynucleotides set forth in the Sequence Listing.

The invention further relates to polynucleotides that hybridize to the above-described
sequences. In particular, the invention relates to polynucleotides that hybridize under stringent
conditions to the above-described polynucleotides. As used herein, the terms "stringent
conditions" and "stringent hybridization conditions" mean that hybridization will generally occur
if there is at least 95% and preferably at least 97% identity between the sequences. An example
of stringent hybridization conditions is overnight incubation at 42°C in a solution comprising
50% formamide, 5x SSC (1x SSC being 150 mM NaCl, 15 mM trisodium citrate), 50 mM sodium
phosphate (pH 7.6), 5x Denhardt's solution, 10% dextran sulfate, and 20 micrograms/milliliter
denatured, sheared salmon sperm DNA, followed by washing the hybridization support in 0.1x SSC at
approximately 65°C. Other hybridization and wash conditions are well known and are
exemplified in Sambrook, et al., Molecular Cloning: A Laboratory Manual, Second Edition, Cold
Spring Harbor, NY (1989), particularly Chapter 11.
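
The salt concentrations implied by the hybridization and wash buffers above follow from simple dilution arithmetic. The sketch below is illustrative only: it assumes the standard definition of 1x SSC as 150 mM NaCl and 15 mM trisodium citrate, and the function name `ssc_concentrations` is hypothetical rather than part of any described method.

```python
# Composition of 1x SSC in millimolar (assumption: the standard definition,
# 150 mM NaCl and 15 mM trisodium citrate).
SSC_1X_MM = {"NaCl": 150.0, "trisodium citrate": 15.0}


def ssc_concentrations(fold):
    """Return the millimolar concentration of each SSC component at `fold`x strength."""
    if fold < 0:
        raise ValueError("fold strength must be non-negative")
    return {name: conc * fold for name, conc in SSC_1X_MM.items()}


if __name__ == "__main__":
    # Hybridization buffer strength (5x) and stringent wash strength (0.1x).
    for fold in (5, 0.1):
        print(fold, ssc_concentrations(fold))
```

Under this assumption the 0.1x wash has one fiftieth of the salt of the 5x hybridization solution, and it is this low-salt, high-temperature wash that makes the conditions stringent.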

The invention also provides a polynucleotide consisting essentially of a polynucleotide
sequence obtainable by screening an appropriate library containing the complete gene for a
polynucleotide sequence set forth in the Sequence Listing under stringent hybridization conditions
with a probe having the sequence of said polynucleotide sequence or a fragment thereof; and
isolating said polynucleotide sequence. Fragments useful for obtaining such a polynucleotide
include, for example, probes and primers as described herein.

As discussed herein regarding polynucleotide assays of the invention, for example,
polynucleotides of the invention can be used as a hybridization probe for RNA, cDNA, or
genomic DNA to isolate full length cDNAs or genomic clones encoding a polypeptide and to
isolate cDNA or genomic clones of other genes that have a high sequence similarity to a
polynucleotide set forth in the Sequence Listing. Such probes will generally comprise at least 15
bases. Preferably such probes will have at least 30 bases and can have at least 50 bases.
Particularly preferred probes will have between 30 bases and 50 bases, inclusive.

The coding region of each gene that comprises or is comprised by a polynucleotide
sequence set forth in the Sequence Listing may be isolated by screening using a DNA sequence
provided in the Sequence Listing to synthesize an oligonucleotide probe. A labeled
oligonucleotide having a sequence complementary to that of a gene of the invention is then used
to screen a library of cDNA, genomic DNA or mRNA to identify members of the library which
hybridize to the probe. For example, synthetic oligonucleotides are prepared which correspond
to the prenyltransferase EST sequences. The oligonucleotides are used as primers in polymerase
chain reaction (PCR) techniques to obtain 5' and 3' terminal sequence of prenyltransferase
genes. Alternatively, where oligonucleotides of low degeneracy can be prepared from particular
prenyltransferase peptides, such probes may be used directly to screen gene libraries for
prenyltransferase gene sequences. In particular, screening of cDNA libraries in phage vectors is
useful in such methods due to lower levels of background hybridization.

Typically, a prenyltransferase sequence obtainable from the use of nucleic acid probes
will show 60-70% sequence identity between the target prenyltransferase sequence and the
encoding sequence used as a probe. However, lengthy sequences with as little as 50-60%
sequence identity may also be obtained. The nucleic acid probes may be a lengthy fragment of
the nucleic acid sequence, or may also be a shorter, oligonucleotide probe. When longer nucleic
acid fragments are employed as probes (greater than about 100 bp), one may screen at lower
stringencies in order to obtain sequences from the target sample which have 20-50% deviation
(i.e., 50-80% sequence homology) from the sequences used as probe. Oligonucleotide probes
can be considerably shorter than the entire nucleic acid sequence encoding a prenyltransferase
enzyme, but should be at least about 10, preferably at least about 15, and more preferably at least
about 20 nucleotides. A higher degree of sequence identity is desired when shorter regions are
used as opposed to longer regions. It may thus be desirable to identify regions of highly
conserved amino acid sequence to design oligonucleotide probes for detecting and recovering
other related prenyltransferase genes. Shorter probes are often particularly useful for polymerase
chain reactions (PCR), especially when highly conserved sequences can be identified. (See
Gould, et al., PNAS USA (1989) 86:1934-1938.)

Another aspect of the present invention relates to prenyltransferase polypeptides. Such
polypeptides include isolated polypeptides set forth in the Sequence Listing, as well as
polypeptides and fragments thereof, particularly those polypeptides which exhibit
prenyltransferase activity and also those polypeptides which have at least 50%, 60% or 70%
identity, preferably at least 80% identity, more preferably at least 90% identity, and most
preferably at least 95% identity to a polypeptide sequence selected from the group of sequences
set forth in the Sequence Listing, and also include portions of such polypeptides, wherein such
portion of the polypeptide preferably includes at least 30 amino acids and more preferably
includes at least 50 amino acids.

"Identity", as is well understood in the art, is a relationship between two or more 
polypeptide sequences or two or more polynucleotide sequences, as determined by comparing the
sequences. In the art, "identity" also means the degree of sequence relatedness between
polypeptide or polynucleotide sequences, as determined by the match between strings of such
sequences. "Identity" can be readily calculated by known methods including, but not limited to,
those described in Computational Molecular Biology, Lesk, A.M., ed., Oxford University Press,
New York (1988); Biocomputing: Informatics and Genome Projects, Smith, D.W., ed.,
Academic Press, New York, 1993; Computer Analysis of Sequence Data, Part I, Griffin, A.M.
and Griffin, H.G., eds., Humana Press, New Jersey (1994); Sequence Analysis in Molecular
Biology, von Heijne, G., Academic Press (1987); Sequence Analysis Primer, Gribskov, M. and
Devereux, J., eds., Stockton Press, New York (1991); and Carillo, H., and Lipman, D., SIAM J
Applied Math, 48:1073 (1988). Methods to determine identity are designed to give the largest
match between the sequences tested. Moreover, methods to determine identity are codified in
publicly available programs. Computer programs which can be used to determine identity
between two sequences include, but are not limited to, GCG (Devereux, J., et al., Nucleic Acids
Research 12(1):387 (1984)); and a suite of five BLAST programs, three designed for nucleotide
sequence queries (BLASTN, BLASTX, and TBLASTX) and two designed for protein sequence
queries (BLASTP and TBLASTN) (Coulson, Trends in Biotechnology, 12: 76-80 (1994); Birren,
et al., Genome Analysis, 1: 543-559 (1997)). The BLASTX program is publicly available from
NCBI and other sources (BLAST Manual, Altschul, S., et al., NCBI NLM NIH, Bethesda, MD
20894; Altschul, S., et al., J. Mol. Biol., 215:403-410 (1990)). The well known Smith Waterman
algorithm can also be used to determine identity.
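
Once two sequences have been aligned by one of the programs above, percent identity reduces to counting matching positions. The following is a minimal sketch under stated assumptions: the inputs are already aligned and of equal length, gaps are written as `-`, and the denominator is the number of alignment columns that are not gaps in both sequences. Programs differ in how they treat gapped columns, so this is one convention among several, and the function name `percent_identity` is hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned sequences of equal length.

    Columns where both sequences carry a gap are ignored; a column with a gap
    in only one sequence counts as a mismatch (an assumed convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue  # column carries no information
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


if __name__ == "__main__":
    print(percent_identity("ACGTACGT", "ACGTTCGT"))
```

A value computed this way can be compared against the identity thresholds recited herein (for example, 50%, 80%, or 95%), keeping in mind that the result depends on the alignment and on the gap convention chosen.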

Parameters for polypeptide sequence comparison typically include the following:
Algorithm: Needleman and Wunsch, J. Mol. Biol. 48:443-453 (1970)
Comparison matrix: BLOSUM62 from Henikoff and Henikoff, Proc. Natl. Acad. Sci.
USA 89:10915-10919 (1992)
Gap Penalty: 12
Gap Length Penalty: 4

A program which can be used with these parameters is publicly available as the "gap"
program from Genetics Computer Group, Madison, Wisconsin. The above parameters, along with
no penalty for end gap, are the default parameters for peptide comparisons.

Parameters for polynucleotide sequence comparison include the following: 
Algorithm: Needleman and Wunsch, J. Mol. Biol. 48:443-453 (1970)
Comparison matrix: matches = +10; mismatches = 0
Gap Penalty: 50
Gap Length Penalty: 3

A program which can be used with these parameters is publicly available as the "gap"
program from Genetics Computer Group, Madison, Wisconsin. The above parameters are the
default parameters for nucleic acid comparisons.
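
To make the nucleic acid parameters above concrete, the sketch below computes a global (Needleman and Wunsch style) alignment score with match = +10, mismatch = 0, gap penalty = 50, and gap length penalty = 3. It is not the GCG "gap" program. It is a small dynamic-programming illustration that assumes a gap of length k costs 50 + 3k and that end gaps are penalized like internal gaps; the function name `global_alignment_score` is hypothetical.

```python
NEG_INF = float("-inf")


def global_alignment_score(a, b, match=10, mismatch=0, gap_open=50, gap_extend=3):
    """Best global alignment score with affine gaps (a gap of length k costs
    gap_open + k * gap_extend; assumed convention, end gaps penalized)."""
    n, m = len(a), len(b)
    # M: last column aligns a[i-1] with b[j-1]
    # X: last column aligns a[i-1] with a gap
    # Y: last column aligns b[j-1] with a gap
    M = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    X = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    Y = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    M[0][0] = 0
    for i in range(1, n + 1):
        X[i][0] = -(gap_open + i * gap_extend)
    for j in range(1, m + 1):
        Y[0][j] = -(gap_open + j * gap_extend)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            M[i][j] = s + max(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1])
            # Opening a new gap pays gap_open plus one extension; continuing pays one extension.
            X[i][j] = max(M[i - 1][j] - gap_open - gap_extend,
                          X[i - 1][j] - gap_extend,
                          Y[i - 1][j] - gap_open - gap_extend)
            Y[i][j] = max(M[i][j - 1] - gap_open - gap_extend,
                          Y[i][j - 1] - gap_extend,
                          X[i][j - 1] - gap_open - gap_extend)
    return max(M[n][m], X[n][m], Y[n][m])


if __name__ == "__main__":
    print(global_alignment_score("ACGT", "ACGT"))
    print(global_alignment_score("ACGT", "ACG"))
```

With these values a single gap costs more than five matches earn, so the scoring strongly favors ungapped alignments of closely related sequences. A polypeptide comparison would follow the same recurrence with a substitution matrix in place of the match and mismatch scores.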

The invention also includes polypeptides of the formula:
X-(R1)n-(R2)-(R3)n-Y

wherein, at the amino terminus, X is hydrogen, and at the carboxyl terminus, Y is hydrogen or a
metal, R1 and R3 are any amino acid residue, n is an integer between 1 and 1000, and R2 is an
amino acid sequence of the invention, particularly an amino acid sequence selected from the
group set forth in the Sequence Listing and preferably those encoded by the sequences provided
in SEQ ID NOs: 2, 4, 6, 9, 12, 17, 19-22, 24-28, 30, 32-35, 37, and 39. In the formula, R2 is
oriented so that its amino terminal residue is at the left, bound to R1, and its carboxy terminal
residue is at the right, bound to R3. Any stretch of amino acid residues denoted by either R
group, where n is greater than 1, may be either a heteropolymer or a homopolymer, preferably a
heteropolymer.

Polypeptides of the present invention include isolated polypeptides encoded by a
polynucleotide comprising a sequence selected from the group of sequences contained in the
Sequence Listing set forth herein.

The polypeptides of the present invention can be mature protein or can be part of a fusion
protein.

Fragments and variants of the polypeptides are also considered to be a part of the
invention. A fragment is a variant polypeptide which has an amino acid sequence that is entirely
the same as part but not all of the amino acid sequence of the previously described polypeptides.
The fragments can be "free-standing" or comprised within a larger polypeptide of which the
fragment forms a part or a region, most preferably as a single continuous region. Preferred
fragments are biologically active fragments which are those fragments that mediate activities of
the polypeptides of the invention, including those with similar activity or improved activity or
with a decreased activity. Also included are those fragments that are antigenic or immunogenic
in an animal, particularly a human.

Variants of the polypeptide also include polypeptides that vary from the sequences set
forth in the Sequence Listing by conservative amino acid substitutions, that is, substitution of a
residue by another with like characteristics. In general, such substitutions are among Ala, Val,
Leu and Ile; between Ser and Thr; between Asp and Glu; between Asn and Gln; between Lys and
Arg; or between Phe and Tyr. Particularly preferred are variants in which 5 to 10; 1 to 5; 1 to 3
or one amino acid(s) are substituted, deleted, or added, in any combination.
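
The substitution groups listed above can be expressed as a simple lookup. The sketch below is illustrative only: it writes the residues with the standard three-letter codes (Ile for isoleucine, Gln for glutamine), and the names `CONSERVATIVE_GROUPS` and `is_conservative` are hypothetical.

```python
# Groups of residues with like characteristics, as enumerated in the text above
# (standard three-letter amino acid codes).
CONSERVATIVE_GROUPS = [
    {"Ala", "Val", "Leu", "Ile"},
    {"Ser", "Thr"},
    {"Asp", "Glu"},
    {"Asn", "Gln"},
    {"Lys", "Arg"},
    {"Phe", "Tyr"},
]


def is_conservative(original, replacement):
    """True if replacing `original` with `replacement` stays within one group.

    An unchanged residue is not a substitution, so it returns False.
    """
    if original == replacement:
        return False
    return any(original in group and replacement in group
               for group in CONSERVATIVE_GROUPS)


if __name__ == "__main__":
    print(is_conservative("Leu", "Ile"))
    print(is_conservative("Ser", "Asp"))
```

Counting the positions of a variant for which `is_conservative` holds gives a quick check against the preferred ranges of substituted residues recited above.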

Variants that are fragments of the polypeptides of the invention can be used to produce
the corresponding full length polypeptide by peptide synthesis. Therefore, these variants can be
used as intermediates for producing the full-length polypeptides of the invention.

The polynucleotides and polypeptides of the invention can be used, for example, in the
transformation of host cells, such as plant host cells, as further discussed herein.

The invention also provides polynucleotides that encode a polypeptide that is a mature
protein plus additional amino or carboxyl-terminal amino acids, or amino acids within the mature
polypeptide (for example, when the mature form of the protein has more than one polypeptide
chain). Such sequences can, for example, play a role in the processing of a protein from a
precursor to a mature form, allow protein transport, shorten or lengthen protein half-life, or
facilitate manipulation of the protein in assays or production. It is contemplated that cellular
enzymes can be used to remove any additional amino acids from the mature protein.

A precursor protein, having the mature form of the polypeptide fused to one or more
prosequences, may be an inactive form of the polypeptide. The inactive precursors generally are
activated when the prosequences are removed. Some or all of the prosequences may be removed
prior to activation. Such precursor proteins are generally called proproteins.

Plant Constructs and Methods of Use 

Of particular interest is the use of the nucleotide sequences in recombinant DNA
constructs to direct the transcription or transcription and translation (expression) of the
prenyltransferase sequences of the present invention in a host plant cell. The expression
constructs generally comprise a promoter functional in a host plant cell operably linked to a
nucleic acid sequence encoding a prenyltransferase of the present invention and a transcriptional
termination region functional in a host plant cell.

A first nucleic acid sequence is "operably linked" or "operably associated" with a second
nucleic acid sequence when the sequences are so arranged that the first nucleic acid sequence
affects the function of the second nucleic acid sequence. Preferably, the two sequences are part
of a single contiguous nucleic acid molecule and more preferably are adjacent. For example, a
promoter is operably linked to a gene if the promoter regulates or mediates transcription of the
gene in a cell.

Those skilled in the art will recognize that there are a number of promoters which are
functional in plant cells, and have been described in the literature. Chloroplast and plastid
specific promoters, chloroplast or plastid functional promoters, and chloroplast or plastid
operable promoters are also envisioned.

One set of plant functional promoters are constitutive promoters such as the CaMV35S or
FMV35S promoters that yield high levels of expression in most plant organs. Enhanced or
duplicated versions of the CaMV35S and FMV35S promoters are useful in the practice of this
invention (Odell, et al. (1985) Nature 313:810-812; Rogers, U.S. Patent Number 5,378,619). In
addition, it may also be preferred to bring about expression of the prenyltransferase gene in
specific tissues of the plant, such as leaf, stem, root, tuber, seed, fruit, etc., and the promoter
chosen should have the desired tissue and developmental specificity.
Of particular interest is the expression of the nucleic acid sequences of the present
invention from transcription initiation regions which are preferentially expressed in a plant seed
tissue. Examples of such seed preferential transcription initiation sequences include those
sequences derived from sequences encoding plant storage protein genes or from genes involved
in fatty acid biosynthesis in oilseeds. Examples of such promoters include the 5' regulatory
regions from such genes as napin (Kridl et al., Seed Sci. Res. 1:209-219 (1991)), phaseolin, zein,
soybean trypsin inhibitor, ACP, stearoyl-ACP desaturase, soybean α' subunit of β-conglycinin
(soy 7s, (Chen et al., Proc. Natl. Acad. Sci., 83:8560-8564 (1986))) and oleosin.

It may be advantageous to direct the localization of proteins conferring prenyltransferase
to a particular subcellular compartment, for example, to the mitochondrion, endoplasmic
reticulum, vacuoles, chloroplast or other plastidic compartment. For example, where the genes of
interest of the present invention will be targeted to plastids, such as chloroplasts, for expression, 
the constructs will also employ the use of sequences to direct the gene to the plastid. Such 
sequences are referred to herein as chloroplast transit peptides (CTP) or plastid transit peptides 
(PTP). In this manner, where the gene of interest is not directly inserted into the plastid, the 
expression construct will additionally contain a gene encoding a transit peptide to direct the gene 
of interest to the plastid. The chloroplast transit peptides may be derived from the gene of 
interest, or may be derived from a heterologous sequence having a CTP. Such transit peptides 
are known in the art. See, for example, Von Heijne et al. (1991) Plant Mol. Biol. Rep. 9:104-126;
Clark et al. (1989) J. Biol. Chem. 264:17544-17550; della-Cioppa et al. (1987) Plant
Physiol. 84:965-968; Romer et al. (1993) Biochem. Biophys. Res. Commun. 196:1414-1421; and
Shah et al. (1986) Science 233:478-481.

Depending upon the intended use, the constructs may contain the nucleic acid sequence 
which encodes the entire prenyltransferase protein, or a portion thereof. For example, where 
antisense inhibition of a given prenyltransferase protein is desired, the entire prenyltransferase 
sequence is not required. Furthermore, where prenyltransferase sequences used in constructs are 
intended for use as probes, it may be advantageous to prepare constructs containing only a 
particular portion of a prenyltransferase encoding sequence, for example a sequence which is 
discovered to encode a highly conserved prenyltransferase region. 

The skilled artisan will recognize that there are various methods for the inhibition of 
expression of endogenous sequences in a host cell. Such methods include, but are not limited to, 
antisense suppression (Smith, et al. (1988) Nature 334:724-726), co-suppression (Napoli, et al.
(1989) Plant Cell 2:279-289), ribozymes (PCT Publication WO 97/10328), and combinations of
sense and antisense (Waterhouse, et al. (1998) Proc. Natl. Acad. Sci. USA 95:13959-13964).
Methods for the suppression of endogenous sequences in a host cell typically employ the 
transcription or transcription and translation of at least a portion of the sequence to be 
suppressed. Such sequences may be homologous to coding as well as non-coding regions of the 
endogenous sequence. 

Regulatory transcript termination regions may be provided in plant expression constructs
of this invention as well. Transcript termination regions may be provided by the DNA sequence
encoding the prenyltransferase or a convenient transcription termination region derived from a
different gene source, for example, the transcript termination region which is naturally associated
with the transcript initiation region. The skilled artisan will recognize that any convenient
transcript termination region which is capable of terminating transcription in a plant cell may be
employed in the constructs of the present invention.

Alternatively, constructs may be prepared to direct the expression of the prenyltransferase 
sequences directly from the host plant cell plastid. Such constructs and methods are known in 
the art and are generally described, for example, in Svab, et al. (1990) Proc. Natl. Acad. Sci. USA
87:8526-8530 and Svab and Maliga (1993) Proc. Natl. Acad. Sci. USA 90:913-917 and in U.S.
Patent Number 5,693,507.

The prenyltransferase constructs of the present invention can be used in transformation
methods with additional constructs providing for the expression of other nucleic acid sequences
encoding proteins involved in the production of tocopherols, or tocopherol precursors such as
homogentisic acid and/or phytylpyrophosphate. Nucleic acid sequences encoding proteins
involved in the production of homogentisic acid are known in the art, and include, but are not
limited to, 4-hydroxyphenylpyruvate dioxygenase (HPPD, EC 1.13.11.27) described, for example,
by Garcia, et al. ((1999) Plant Physiol. 119(4):1507-1516), mono or bifunctional tyrA (described,
for example, by Xia, et al. (1992) J. Gen. Microbiol. 138:1309-1316, and Hudson, et al. (1984) J.
Mol. Biol. 180:1023-1051), Oxygenase, 4-hydroxyphenylpyruvate di- (9CI), 4-
Hydroxyphenylpyruvate dioxygenase; p-Hydroxyphenylpyruvate dioxygenase; p-
Hydroxyphenylpyruvate hydroxylase; p-Hydroxyphenylpyruvate oxidase; p-
Hydroxyphenylpyruvic acid hydroxylase; p-Hydroxyphenylpyruvic hydroxylase; p-
Hydroxyphenylpyruvic oxidase), 4-hydroxyphenylacetate, NAD(P)H:oxygen oxidoreductase (1-
hydroxylating); 4-hydroxyphenylacetate 1-monooxygenase, and the like. In addition, constructs
for the expression of nucleic acid sequences encoding proteins involved in the production of
phytylpyrophosphate can also be employed with the prenyltransferase constructs of the present
invention. Nucleic acid sequences encoding proteins involved in the production of
phytylpyrophosphate are known in the art, and include, but are not limited to,
geranylgeranylpyrophosphate synthase (GGPPS), geranylgeranylpyrophosphate reductase
(GGH), 1-deoxyxylulose-5-phosphate synthase, 1-deoxy-D-xylulose-5-phosphate
reductoisomerase, 4-diphosphocytidyl-2-C-methylerythritol synthase, and isopentenyl pyrophosphate
isomerase.

The prenyltransferase sequences of the present invention find use in the preparation of 
transformation constructs having a second expression cassette for the expression of additional 
sequences involved in tocopherol biosynthesis. Additional tocopherol biosynthesis sequences 
of interest in the present invention include, but are not limited to, gamma-tocopherol
methyltransferase (Shintani, et al. (1998) Science 282(5396):2098-2100), tocopherol cyclase, and
tocopherol methyltransferase. 

A plant cell, tissue, organ, or plant into which the recombinant DNA constructs 
containing the expression constructs have been introduced is considered transformed, transfected, 
or transgenic. A transgenic or transformed cell or plant also includes progeny of the cell or plant 
and progeny produced from a breeding program employing such a transgenic plant as a parent in 
a cross and exhibiting an altered phenotype resulting from the presence of a prenyltransferase 
nucleic acid sequence. 

Plant expression or transcription constructs having a prenyltransferase as the DNA 
sequence of interest for increased or decreased expression thereof may be employed with a wide 
variety of plant life, particularly, plant life involved in the production of vegetable oils for edible 
and industrial uses. Particularly preferred plants for use in the methods of the present invention 
include, but are not limited to: Acacia, alfalfa, aneth, apple, apricot, artichoke, arugula, 
asparagus, avocado, banana, barley, beans, beet, blackberry, blueberry, broccoli, brussels sprouts,
cabbage, canola, cantaloupe, carrot, cassava, cauliflower, celery, cherry, chicory, cilantro, citrus,
clementines, coffee, corn, cotton, cucumber, Douglas fir, eggplant, endive, escarole, eucalyptus,
fennel, figs, garlic, gourd, grape, grapefruit, honey dew, jicama, kiwifruit, lettuce, leeks, lemon,
lime, Loblolly pine, mango, melon, mushroom, nectarine, nut, oat, oil palm, oil seed rape, okra, 
onion, orange, an ornamental plant, papaya, parsley, pea, peach, peanut, pear, pepper, 
persimmon, pine, pineapple, plantain, plum, pomegranate, poplar, potato, pumpkin, quince, 
radiata pine, radicchio, radish, raspberry, rice, rye, sorghum, Southern pine, soybean, spinach, 
squash, strawberry, sugarbeet, sugarcane, sunflower, sweet potato, sweetgum, tangerine, tea, 
tobacco, tomato, triticale, turf, turnip, a vine, watermelon, wheat, yams, and zucchini.

Most especially preferred are temperate oilseed crops. Temperate oilseed crops of interest
include, but are not limited to, rapeseed (Canola and High Erucic Acid varieties), sunflower,
safflower, cotton, soybean, peanut, coconut and oil palms, and corn. Depending on the method
for introducing the recombinant constructs into the host cell, other DNA sequences may be
required. Importantly, this invention is applicable to dicotyledonous and monocotyledonous species
alike and will be readily applicable to new and/or improved transformation and regulation
techniques.

Of particular interest is the use of prenyltransferase constructs in plants to produce plants
or plant parts, including, but not limited to, leaves, stems, roots, reproductive tissues, and seed, with a
modified content of tocopherols in plant parts having transformed plant cells.

For immunological screening, antibodies to the protein can be prepared by injecting 
rabbits or mice with the purified protein or portion thereof, such methods of preparing antibodies 
being well known to those in the art. Either monoclonal or polyclonal antibodies can be 
produced, although typically polyclonal antibodies are more useful for gene isolation. Western 
analysis may be conducted to determine that a related protein is present in a crude extract of the 
desired plant species, as determined by cross-reaction with the antibodies to the encoded 
proteins. When cross-reactivity is observed, genes encoding the related proteins are isolated by 
screening expression libraries representing the desired plant species. Expression libraries can be 
constructed in a variety of commercially available vectors, including lambda gt11, as described in
Sambrook, et al. (Molecular Cloning: A Laboratory Manual, Second Edition (1989) Cold Spring
Harbor Laboratory, Cold Spring Harbor, New York). 

To confirm the activity and specificity of the proteins encoded by the identified nucleic 
acid sequences as prenyltransferase enzymes, in vitro assays are performed in insect cell cultures 
using baculovirus expression systems. Such baculovirus expression systems are known in the art 
and are described by Lee, et al., U.S. Patent Number 5,348,886, the entirety of which is herein
incorporated by reference. 

In addition, other expression constructs may be prepared to assay for protein activity 
utilizing different expression systems. Such expression constructs are transformed into a yeast or
prokaryotic host and assayed for prenyltransferase activity. Such expression systems are known
in the art and are readily available through commercial sources.

In addition to the sequences described in the present invention, DNA coding sequences
useful in the present invention can be derived from algae, fungi, bacteria, mammalian sources,
plants, etc. Homology searches in existing databases using signature sequences corresponding
to conserved nucleotide and amino acid sequences of prenyltransferase can be employed to
isolate equivalent, related genes from other sources such as plants and microorganisms.
Searches in EST databases can also be employed. Furthermore, the use of DNA sequences
encoding enzymes functionally and enzymatically equivalent to those disclosed herein, wherein such
DNA sequences are degenerate equivalents of the nucleic acid sequences disclosed herein in
accordance with the degeneracy of the genetic code, is also encompassed by the present
invention. Demonstration of the functionality of coding sequences identified by any of these
methods can be carried out by complementation of mutants of appropriate organisms, such as
Synechocystis, Shewanella, yeast, Pseudomonas, Rhodobacteria, etc., that lack specific
biochemical reactions, or that have been mutated. The sequences of the DNA coding regions
can be optimized by gene resynthesis, based on codon usage, for maximum expression in
particular hosts.
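
By way of illustration only, resynthesis based on codon usage can be sketched as a synonymous recoding of a coding sequence. The sketch below is in Python and assumes the standard genetic code. The function name `recode` and the table `HYPOTHETICAL_PREFERRED` are placeholders introduced here; the codon preferences shown are invented for the example and are not measured usage data for any host.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Hypothetical per-host preferences (placeholder values, not real usage data).
HYPOTHETICAL_PREFERRED = {"A": "GCC", "K": "AAG", "L": "CTG"}


def translate(cds):
    """Translate an in-frame coding sequence with the standard code."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def recode(cds, preferred):
    """Swap each codon for the host-preferred synonymous codon, if one is given."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    out = []
    for pos in range(0, len(cds), 3):
        codon = cds[pos:pos + 3]
        residue = CODON_TABLE[codon]
        choice = preferred.get(residue, codon)
        if CODON_TABLE[choice] != residue:
            raise ValueError(f"{choice} does not encode {residue}")
        out.append(choice)
    return "".join(out)


original = "ATGGCTAAATTATAA"
optimized = recode(original, HYPOTHETICAL_PREFERRED)
print(optimized)
```

The recoded sequence is a degenerate equivalent of the input: the nucleotides change while the encoded amino acid sequence does not.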

For the alteration of tocopherol production in a host cell, a second expression construct
can be used in accordance with the present invention. For example, the prenyltransferase
expression construct can be introduced into a host cell in conjunction with a second expression
construct having a nucleotide sequence for a protein involved in tocopherol biosynthesis.

The method of transformation in obtaining such transgenic plants is not critical to the
instant invention, and various methods of plant transformation are currently available.
Furthermore, as newer methods become available to transform crops, they may also be directly
applied hereunder. For example, many plant species naturally susceptible to Agrobacterium
infection may be successfully transformed via tripartite or binary vector methods of
Agrobacterium mediated transformation. In many instances, it will be desirable to have the
construct bordered on one or both sides by T-DNA, particularly having the left and right borders,
more particularly the right border. This is particularly useful when the construct uses A.
tumefaciens or A. rhizogenes as a mode for transformation, although the T-DNA borders may
find use with other modes of transformation. In addition, techniques of microinjection, DNA
particle bombardment, and electroporation have been developed which allow for the
transformation of various monocot and dicot plant species.

Normally, included with the DNA construct will be a structural gene having the necessary 
regulatory regions for expression in a host and providing for selection of transformant cells. The 
gene may provide for resistance to a cytotoxic agent, e.g. antibiotic, heavy metal, toxin, etc., 
complementation providing prototrophy to an auxotrophic host, viral immunity or the like. 
Depending upon the number of different host species into which the expression construct or
components thereof are introduced, one or more markers may be employed, where different
conditions for selection are used for the different hosts.

Where Agrobacterium is used for plant cell transformation, a vector may be used which 
may be introduced into the Agrobacterium host for homologous recombination with T-DNA or 
the Ti- or Ri-plasmid present in the Agrobacterium host. The Ti- or Ri-plasmid containing the
T-DNA for recombination may be armed (capable of causing gall formation) or disarmed
(incapable of causing gall formation), the latter being permissible, so long as the vir genes are 
present in the transformed Agrobacterium host. The armed plasmid can give a mixture of normal 
plant cells and gall. 

In some instances where Agrobacterium is used as the vehicle for transforming host plant 
cells, the expression or transcription construct bordered by the T-DNA border region(s) will be 
inserted into a broad host range vector capable of replication in E. coli and Agrobacterium, there 
being broad host range vectors described in the literature. Commonly used is pRK2 or 
derivatives thereof. See, for example, Ditta, et al. (Proc. Nat. Acad. Sci., U.S.A. (1980)
77:7347-7351) and EPA 0 120 515, which are incorporated herein by reference. Alternatively,
one may insert the sequences to be expressed in plant cells into a vector containing separate
replication sequences, one of which stabilizes the vector in E. coli, and the other in
Agrobacterium. See, for example, McBride, et al. (Plant Mol. Biol. (1990) 14:269-276), wherein
the pRiHRI (Jouanin, et al., Mol. Gen. Genet. (1985) 201:370-374) origin of replication is
utilized and provides for added stability of the plant expression vectors in host Agrobacterium 
cells. 

Included with the expression construct and the T-DNA will be one or more markers, 
which allow for selection of transformed Agrobacterium and transformed plant cells. A
number of markers have been developed for use with plant cells, such as resistance to
chloramphenicol, kanamycin, the aminoglycoside G418, hygromycin, or the like. The particular 
marker employed is not essential to this invention, one or another marker being preferred 
depending on the particular host and the manner of construction. 

For transformation of plant cells using Agrobacterium, explants may be combined and 
incubated with the transformed Agrobacterium for sufficient time for transformation, the bacteria 
killed, and the plant cells cultured in an appropriate selective medium. Once callus forms, shoot 
formation can be encouraged by employing the appropriate plant hormones in accordance with 
known methods and the shoots transferred to rooting medium for regeneration of plants. The 
plants may then be grown to seed and the seed used to establish repetitive generations and for 
isolation of vegetable oils. 

There are several possible ways to obtain the plant cells of this invention which contain 
multiple expression constructs. Any means for producing a plant comprising a construct having
a DNA sequence encoding the expression construct of the present invention, and at least one
other construct having another DNA sequence encoding an enzyme, is encompassed by the
present invention. For example, the expression construct can be used to transform a plant at the
same time as the second construct either by inclusion of both expression constructs in a single
transformation vector or by using separate vectors, each of which expresses desired genes. The
second construct can be introduced into a plant which has already been transformed with the 
prenyltransferase expression construct, or alternatively, transformed plants, one expressing the 
prenyltransferase construct and one expressing the second construct, can be crossed to bring the 
constructs together in the same plant. 

The nucleic acid sequences of the present invention can be used in constructs to provide 
for the expression of the sequence in a variety of host cells, both prokaryotic and eukaryotic. Host
cells of the present invention preferably include monocotyledonous and dicotyledonous plant
cells. 

In general, the skilled artisan is familiar with the standard resource materials which 
describe specific conditions and procedures for the construction, manipulation and isolation of 
macromolecules (e.g., DNA molecules, plasmids, etc.), generation of recombinant organisms and 
the screening and isolating of clones (see, for example, Sambrook et al., Molecular Cloning: A
Laboratory Manual, Cold Spring Harbor Press (1989); Maliga et al., Methods in Plant
Molecular Biology, Cold Spring Harbor Press (1995), the entirety of which is herein incorporated
by reference; Birren et al., Genome Analysis: Analyzing DNA, 1, Cold Spring Harbor, New York,
the entirety of which is herein incorporated by reference).

Methods for the expression of sequences in insect host cells are known in the art. 
Baculovirus expression vectors are recombinant insect viruses in which the coding sequence for a 
chosen foreign gene has been inserted behind a baculovirus promoter in place of the viral gene, 
e.g., polyhedrin (Smith and Summers, U.S. Pat. No. 4,745,051, the entirety of which is
incorporated herein by reference). Baculovirus expression vectors are known in the art, and are
described, for example, in Doerfler, Curr. Top. Microbiol. Immunol. 131:51-68 (1968); Luckow
and Summers, Bio/Technology 6:47-55 (1988a); Miller, Annual Review of Microbiol. 42:177-199
(1988); Summers, Curr. Comm. Molecular Biology, Cold Spring Harbor Press, Cold Spring
Harbor, N.Y. (1988); Summers and Smith, A Manual of Methods for Baculovirus Vectors and
Insect Cell Culture Procedures, Texas Ag. Exper. Station Bulletin No. 1555 (1988), the
entireties of which are herein incorporated by reference.

Methods for the expression of a nucleic acid sequence of interest in a fungal host cell are 
known in the art. The fungal host cell may, for example, be a yeast cell or a filamentous fungal 
cell. Methods for the expression of DNA sequences of interest in yeast cells are generally
described in "Guide to Yeast Genetics and Molecular Biology", Guthrie and Fink, eds., Methods in
Enzymology, Academic Press, Inc., Vol 194 (1991) and "Gene Expression Technology", Goeddel
ed., Methods in Enzymology, Academic Press, Inc., Vol 185 (1991).

Mammalian cell lines available as hosts for expression are known in the art and include 
many immortalized cell lines available from the American Type Culture Collection (ATCC, 
Manassas, VA), such as HeLa cells, Chinese hamster ovary (CHO) cells, baby hamster kidney 
(BHK) cells and a number of other cell lines. Suitable promoters for mammalian cells are also 
known in the art and include, but are not limited to, viral promoters such as that from Simian 
Virus 40 (SV40) (Fiers et al., Nature 273:113 (1978), the entirety of which is herein incorporated
by reference), Rous sarcoma virus (RSV), adenovirus (ADV) and bovine papilloma virus (BPV).
Mammalian cells may also require terminator sequences and poly-A addition sequences.
Enhancer sequences which increase expression may also be included and sequences which
promote amplification of the gene may also be desirable (for example methotrexate resistance
genes).

Vectors suitable for replication in mammalian cells are well known in the art, and may 
include viral replicons, or sequences which ensure integration of the appropriate sequences
encoding epitopes into the host genome. Plasmid vectors that greatly facilitate the construction of
recombinant viruses have been described (see, for example, Mackett et al., J. Virol. 49:851
(1984); Chakrabarti et al., Mol. Cell. Biol. 5:3403 (1985); Moss, In: Gene Transfer Vectors For
Mammalian Cells, Miller and Calos, eds., Cold Spring Harbor Laboratory, N.Y., p. 10 (1987);
all of which are herein incorporated by reference in their entirety).

The invention now being generally described, it will be more readily understood by 
reference to the following examples which are included for purposes of illustration only and are 
not intended to limit the present invention. 

EXAMPLES 

Example 1: Identification of Prenyltransferase Sequences 

PSI-BLAST (Altschul, et al. (1997) Nuc Acid Res 25:3389-3402) profiles were generated
for both the straight chain and aromatic classes of prenyltransferases. To generate the straight
chain profile, a prenyltransferase from Porphyra purpurea (Genbank accession 1709766) was
used as a query against the NCBI non-redundant protein database. The E. coli enzyme involved
in the formation of ubiquinone, ubiA (Genbank accession 1790473), was used as a starting
sequence to generate the aromatic prenyltransferase profile. These profiles were used to search
public and proprietary DNA and protein databases. In Arabidopsis, seven putative
prenyltransferases of the straight-chain class were identified, ATPT1 (SEQ ID NO:9), ATPT7
(SEQ ID NO:10), ATPT8 (SEQ ID NO:11), ATPT9 (SEQ ID NO:13), ATPT10 (SEQ ID
NO:14), ATPT11 (SEQ ID NO:15), and ATPT12 (SEQ ID NO:16), and five were identified of
the aromatic class, ATPT2 (SEQ ID NO:1), ATPT3 (SEQ ID NO:3), ATPT4 (SEQ ID NO:5),
ATPT5 (SEQ ID NO:7), and ATPT6 (SEQ ID NO:8). Additional prenyltransferase sequences from
other plants related to the aromatic class of prenyltransferases, such as soy (SEQ ID NOs:19-23,
the deduced amino acid sequence of SEQ ID NO:23 is provided in SEQ ID NO:24) and maize
(SEQ ID NOs:25-29, and 31) are also identified. The deduced amino acid sequence of ZMPT5
(SEQ ID NO:29) is provided in SEQ ID NO:30.

Searches are performed on a Silicon Graphics Unix computer using additional
Bioaccellerator hardware and GenWeb software supplied by Compugen Ltd. This software and
hardware enables the use of the Smith-Waterman algorithm in searching DNA and protein
databases using profiles as queries. The program used to query protein databases is profilesearch.
This is a search where the query is not a single sequence but a profile based on a multiple
alignment of amino acid or nucleic acid sequences. The profile is used to query a sequence data
set, i.e., a sequence database. The profile contains all the pertinent information for scoring each
position in a sequence, in effect replacing the "scoring matrix" used for the standard query
searches. The program used to query nucleotide databases with a protein profile is tprofilesearch.
Tprofilesearch searches nucleic acid databases using an amino acid profile query. As the search is
running, sequences in the database are translated to amino acid sequences in six reading frames.
The output file for tprofilesearch is identical to the output file for profilesearch except for an
additional column that indicates the frame in which the best alignment occurred.
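
The six-frame translation step described above can be sketched as follows. This is a minimal illustration in Python assuming the standard genetic code; the function names are hypothetical, incomplete trailing codons are simply dropped, and nothing here represents the actual tprofilesearch implementation.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons only; a trailing partial codon is ignored."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


def six_frame_translations(dna):
    """Map frame labels +1..+3 (forward) and -1..-3 (reverse) to peptides."""
    dna = dna.upper()
    reverse = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[offset + 1] = translate(dna[offset:])
        frames[-(offset + 1)] = translate(reverse[offset:])
    return frames


frames = six_frame_translations("ATGGCCTAA")
for label in (1, 2, 3, -1, -2, -3):
    print(f"{label:+d}", frames[label])
```

Each translated frame can then be scored against the amino acid profile, and the frame label of the best-scoring translation corresponds to the additional output column mentioned above.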
The Smith-Waterman algorithm (Smith and Waterman (1981) supra) is used to search
for similarities between one sequence from the query and a group of sequences contained in the
database. E score values as well as other sequence information, such as conserved peptide
sequences, are used to identify related sequences.
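
For orientation, the core of a Smith-Waterman local alignment can be sketched in a few lines of Python. The sketch is a simplification and rests on stated assumptions: it compares two single sequences rather than a profile against a database, it uses a fixed match/mismatch score and a linear gap penalty (illustrative values only), and it does not compute E scores.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Return (best local score, aligned a, aligned b) for sequences a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] holds the best score of a local alignment ending at a[i-1], b[j-1].
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the highest-scoring cell until a zero cell is reached.
    i, j = best_pos
    top, bottom = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if H[i][j] == diag:
            top.append(a[i - 1])
            bottom.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            top.append(a[i - 1])
            bottom.append("-")
            i -= 1
        else:
            top.append("-")
            bottom.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(top)), "".join(reversed(bottom))


score, aligned_a, aligned_b = smith_waterman("ACACACTA", "AGCACACA")
print(score)
print(aligned_a)
print(aligned_b)
```

In a profile search, the fixed match/mismatch term would be replaced by a position-specific score drawn from the profile, which is the sense in which the profile replaces the scoring matrix.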

To obtain the entire coding region corresponding to the Arabidopsis prenyltransferase
sequences, synthetic oligonucleotide primers are designed to amplify the 5' and 3' ends of
partial cDNA clones containing prenyltransferase sequences. Primers are designed according to
the respective Arabidopsis prenyltransferase sequences and used in Rapid Amplification of
cDNA Ends (RACE) reactions (Frohman et al. (1988) Proc. Natl. Acad. Sci. USA 85:8998-9002)
using the Marathon cDNA amplification kit (Clontech Laboratories Inc, Palo Alto, CA).

Additional BLAST searches are performed using the ATPT2 sequence, a sequence in the
class of aromatic prenyltransferases. Additional sequences are identified in soybean libraries
that are similar to the ATPT2 sequence. The additional soybean sequence demonstrates 80%
identity and 91% similarity at the amino acid level.

Amino acid sequence alignments between ATPT2 (SEQ ID NO:2), ATPT3 (SEQ ID
NO:4), ATPT4 (SEQ ID NO:6), ATPT8 (SEQ ID NO:12), and ATPT12 (SEQ ID NO:17) are
performed using ClustalW (Figure 1), and the percent identity and similarities are provided in
Table 1 below.

Example 2: Preparation of Expression Constructs



A plasmid containing the napin cassette derived from pCGN3223 (described in USPN
5,639,790, the entirety of which is incorporated herein by reference) was modified to make it
more useful for cloning large DNA fragments containing multiple restriction sites, and to allow
the cloning of multiple napin fusion genes into plant binary transformation vectors. An adapter
comprised of the self-annealed oligonucleotide of sequence

CGCGATTTAAATGGCGCGCCCTGCAGGCGGCCGCCTGCAGGGCGCGCCATTTAAAT
(SEQ ID NO:40) was ligated into the cloning vector pBC SK+ (Stratagene) after digestion with
the restriction endonuclease BssHII to construct vector pCGN7765. Plasmids pCGN3223 and
pCGN7765 were digested with NotI and ligated together. The resultant vector, pCGN7770,
contains the pCGN7765 backbone with the napin seed specific expression cassette from
pCGN3223.
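
The self-annealing property of the SEQ ID NO:40 adapter can be checked with a short script. The sketch below, in Python, tests whether the portion of the oligonucleotide after its first four bases is its own reverse complement, and locates several restriction sites within it. The site annotation is an addition made here using the commonly cited recognition sequences for these enzymes; the source text does not list the sites contained in the adapter, and the helper names are hypothetical.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")

ADAPTER = "CGCGATTTAAATGGCGCGCCCTGCAGGCGGCCGCCTGCAGGGCGCGCCATTTAAAT"

# Recognition sequences assumed for annotation (not taken from the source text).
SITES = {
    "SwaI": "ATTTAAAT",
    "AscI": "GGCGCGCC",
    "Sse8387I": "CCTGCAGG",
    "NotI": "GCGGCCGC",
}


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_sites(seq, sites):
    """Return 0-based start positions of each motif, allowing overlaps."""
    return {
        name: [m.start() for m in re.finditer(f"(?={motif})", seq)]
        for name, motif in sites.items()
    }


overhang, core = ADAPTER[:4], ADAPTER[4:]
is_self_complementary = core == reverse_complement(core)
site_positions = find_sites(ADAPTER, SITES)

print("5' overhang:", overhang)
print("core is self-complementary:", is_self_complementary)
print(site_positions)
```

Because the core is palindromic, two copies of the oligonucleotide can anneal to each other and leave a four-base 5' CGCG extension at each end, which is the kind of end expected to be compatible with BssHII-cut vector DNA.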

The cloning cassette, pCGN7787, contains essentially the same regulatory elements as
pCGN7770, with the exception that the napin regulatory regions of pCGN7770 have been replaced
with the double CaMV 35S promoter and the tml polyadenylation and transcriptional
termination region.

A binary vector for plant transformation, pCGN5139, was constructed from pCGN1558
(McBride and Summerfelt, (1990) Plant Molecular Biology, 14:269-276). The polylinker of
pCGN1558 was replaced as a HindIII/Asp718 fragment with a polylinker containing unique
restriction endonuclease sites, AscI, PacI, XbaI, SwaI, BamHI, and NotI. The Asp718 and
HindIII restriction endonuclease sites are retained in pCGN5139.

A series of turbo binary vectors are constructed to allow for the rapid cloning of DNA
sequences into binary vectors containing transcriptional initiation regions (promoters) and
transcriptional termination regions.

The plasmid pCGN8618 was constructed by ligating oligonucleotides 5'-
TCGAGGATCCGCGGCCGCAAGCTTCCTGCAGG-3' (SEQ ID NO:41) and 5'-
TCGACCTGCAGGAAGCTTGCGGCCGCGGATCC-3' (SEQ ID NO:42) into SalI/XhoI-
digested pCGN7770. A fragment containing the napin promoter, polylinker and napin 3' region
was excised from pCGN8618 by digestion with Asp718I; the fragment was blunt-ended by filling
in the 5' overhangs with Klenow fragment then ligated into pCGN5139 that had been digested
with Asp718I and HindIII and blunt-ended by filling in the 5' overhangs with Klenow fragment.
A plasmid containing the insert oriented so that the napin promoter was closest to the blunted
Asp718I site of pCGN5139 and the napin 3' was closest to the blunted HindIII site was subjected
to sequence analysis to confirm both the insert orientation and the integrity of cloning junctions.
The resulting plasmid was designated pCGN8622.
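
As an illustrative check on the linker design, the following Python sketch confirms that the SEQ ID NO:41 and SEQ ID NO:42 oligonucleotides anneal over their full length apart from the first four bases of each, and reports the resulting single-stranded ends. The function name and the fixed four-base overhang length are assumptions made for this example.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

SEQ_ID_41 = "TCGAGGATCCGCGGCCGCAAGCTTCCTGCAGG"
SEQ_ID_42 = "TCGACCTGCAGGAAGCTTGCGGCCGCGGATCC"


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def annealed_overhangs(top, bottom, overhang_len):
    """Check that two oligos pair outside their 5' ends; return both 5' overhangs.

    Raises ValueError if the regions after the first `overhang_len` bases of
    each oligo are not exact reverse complements of one another.
    """
    if top[overhang_len:] != reverse_complement(bottom[overhang_len:]):
        raise ValueError("oligonucleotides are not complementary outside the overhangs")
    return top[:overhang_len], bottom[:overhang_len]


overhangs = annealed_overhangs(SEQ_ID_41, SEQ_ID_42, 4)
print(overhangs)
```

Both ends of the annealed duplex carry a 5' TCGA extension, the overhang left by SalI and by XhoI, which is consistent with ligation into SalI/XhoI-digested vector. Swapping the two oligonucleotides gives the same duplex in the opposite orientation.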

The plasmid pCGN8619 was constructed by ligating oligonucleotides 5'-
TCGACCTGCAGGAAGCTTGCGGCCGCGGATCC-3' (SEQ ID NO:43) and 5'-
TCGAGGATCCGCGGCCGCAAGCTTCCTGCAGG-3' (SEQ ID NO:44) into SalI/XhoI-
digested pCGN7770. A fragment containing the napin promoter, polylinker and napin 3' region
was removed from pCGN8619 by digestion with Asp718I; the fragment was blunt-ended by
filling in the 5' overhangs with Klenow fragment then ligated into pCGN5139 that had been
digested with Asp718I and HindIII and blunt-ended by filling in the 5' overhangs with Klenow
fragment. A plasmid containing the insert oriented so that the napin promoter was closest to the
blunted Asp718I site of pCGN5139 and the napin 3' was closest to the blunted HindIII site was
subjected to sequence analysis to confirm both the insert orientation and the integrity of cloning
junctions. The resulting plasmid was designated pCGN8623.

The plasmid pCGN8620 was constructed by ligating oligonucleotides 5'-
TCGAGGATCCGCGGCCGCAAGCTTCCTGCAGGAGCT-3' (SEQ ID NO:45) and 5'-
CCTGCAGGAAGCTTGCGGCCGCGGATCC-3' (SEQ ID NO:46) into SalI/SacI-digested
pCGN7787. A fragment containing the d35S promoter, polylinker and tml 3' region was
removed from pCGN8620 by complete digestion with Asp718I and partial digestion with NotI.
The fragment was blunt-ended by filling in the 5' overhangs with Klenow fragment then ligated
into pCGN5139 that had been digested with Asp718I and HindIII and blunt-ended by filling in
the 5' overhangs with Klenow fragment. A plasmid containing the insert oriented so that the
d35S promoter was closest to the blunted Asp718I site of pCGN5139 and the tml 3' was closest
to the blunted HindIII site was subjected to sequence analysis to confirm both the insert
orientation and the integrity of cloning junctions. The resulting plasmid was designated
pCGN8624.

The plasmid pCGN8621 was constructed by ligating oligonucleotides 5'-
TCGACCTGCAGGAAGCTTGCGGCCGCGGATCCAGCT-3' (SEQ ID NO:47) and 5'-
GGATCCGCGGCCGCAAGCTTCCTGCAGG-3' (SEQ ID NO:48) into SalI/SacI-digested
pCGN7787. A fragment containing the d35S promoter, polylinker and tml 3' region was
removed from pCGN8621 by complete digestion with Asp718I and partial digestion with NotI.
The fragment was blunt-ended by filling in the 5' overhangs with Klenow fragment then ligated
into pCGN5139 that had been digested with Asp718I and HindIII and blunt-ended by filling in
the 5' overhangs with Klenow fragment. A plasmid containing the insert oriented so that the
d35S promoter was closest to the blunted Asp718I site of pCGN5139 and the tml 3' was closest
to the blunted HindIII site was subjected to sequence analysis to confirm both the insert
orientation and the integrity of cloning junctions. The resulting plasmid was designated
pCGN8625.

The plasmid construct pCGN8640 is a modification of pCGN8624 described above. A
938 bp PstI fragment isolated from transposon Tn7 which encodes bacterial spectinomycin and
streptomycin resistance (Fling et al. (1985), Nucleic Acids Research 13(19):7095-7106), a
determinant for E. coli and Agrobacterium selection, was blunt ended with Pfu polymerase. The
blunt ended fragment was ligated into pCGN8624 that had been digested with SpeI and blunt
ended with Pfu polymerase. The region containing the PstI fragment was sequenced to confirm
both the insert orientation and the integrity of cloning junctions.

The spectinomycin resistance marker was introduced into pCGN8622 and pCGN8623 as
follows. A 7.7 Kbp AvrII-SnaBI fragment from pCGN8640 was ligated to a 10.9 Kbp AvrII-SnaBI
fragment from pCGN8623 or pCGN8622, described above. The resulting plasmids were
pCGN8641 and pCGN8643, respectively.

The plasmid pCGN8644 was constructed by ligating oligonucleotides 5'-
GATCACCTGCAGGAAGCTTGCGGCCGCGGATCCAATGCA-3' (SEQ ID NO:49) and 5'-
TTGGATCCGCGGCCGCAAGCTTCCTGCAGGT-3' (SEQ ID NO:50) into BamHI-PstI
digested pCGN8640.

Synthetic oligonucleotides were designed for use in Polymerase Chain Reactions (PCR) to
amplify the coding sequences of ATPT2, ATPT3, ATPT4, ATPT8, and ATPT12 for the preparation
of expression constructs and are provided in Table 2 below.



Table 2:

Name     Restriction Site   Sequence                                             SEQ ID NO:
ATPT2    5' NotI            GGATCCGCGGCCGCACAATGGAGTCTCTGCTCTCTAGTTCT            51
ATPT2    3' SseI            GGATCCTGCAGGTCACTTCAAAAAAGGTAACAGCAAGT               52
ATPT3    5' NotI            GGATCCGCGGCCGCACAATGGCGTTTTTTGGGCTCTCCCGTGTTT        53
ATPT3    3' SseI            GGATCCTGCAGGTTATTGAAAACTTCTTCCAAGTACAACT             54
ATPT4    5' NotI            GGATCCGCGGCCGCACAATGTGGCGAAGATCTGTTGTT               55
ATPT4    3' SseI            GGATCCTGCAGGTCATGGAGAGTAGAAGGAAGGAGCT                56
ATPT8    5' NotI            GGATCCGCGGCCGCACAATGGTACTTGCCGAGGTTCCAAAGCTTGCCTCT   57
ATPT8    3' SseI            GGATCCTGCAGGTCACTTGTTTCTGGTGATGACTCTAT               58
ATPT12   5' NotI            GGATCCGCGGCCGCACAATGACTTCGATTCTCAACACT               59
ATPT12   3' SseI            GGATCCTGCAGGTCAGTGTTGCGATGCTAATGCCGT                 60




The coding sequences of ATPT2, ATPT3, ATPT4, ATPT8, and ATPT12 were all amplified 
using the respective PCR primers shown in Table 2 above and cloned into the TopoTA vector 
(Invitrogen). Constructs containing the respective prenyltransferase sequences were digested with 
NotI and Sse8387I and cloned into the turbobinary vectors described above. 
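
As an illustration of why the amplified fragments can be excised with NotI and Sse8387I, the sketch below scans four of the Table 2 primers for the relevant recognition sequences. It is not part of the original disclosure; the dictionary and function names are hypothetical, and the recognition sequences used (NotI GCGGCCGC, Sse8387I CCTGCAGG, BamHI GGATCC) are the standard ones for these enzymes.

```python
# Recognition sequences for the enzymes mentioned in the text.
SITES = {"NotI": "GCGGCCGC", "Sse8387I": "CCTGCAGG", "BamHI": "GGATCC"}

# Primers copied from Table 2 (only entries that are fully legible there).
PRIMERS = {
    "ATPT2 5' (SEQ ID NO:51)": "GGATCCGCGGCCGCACAATGGAGTCTCTGCTCTCTAGTTCT",
    "ATPT2 3' (SEQ ID NO:52)": "GGATCCTGCAGGTCACTTCAAAAAAGGTAACAGCAAGT",
    "ATPT12 5' (SEQ ID NO:59)": "GGATCCGCGGCCGCACAATGACTTCGATTCTCAACACT",
    "ATPT12 3' (SEQ ID NO:60)": "GGATCCTGCAGGTCAGTGTTGCGATGCTAATGCCGT",
}


def sites_in(seq):
    """Map each enzyme name to the 0-based position of its first site in seq."""
    return {name: seq.find(site) for name, site in SITES.items() if site in seq}


site_map = {name: sites_in(seq) for name, seq in PRIMERS.items()}
for name, found in site_map.items():
    print(name, found)
```

In this sketch each 5' primer carries a NotI site immediately after a leading BamHI site, and each 3' primer carries an Sse8387I site that overlaps the leading BamHI site, so a NotI/Sse8387I double digest would release the coding sequence in a defined orientation.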

The sequence encoding ATPT2 prenyltransferase was cloned in the sense orientation into 
pCGN8640 to produce the plant transformation construct pCGN10800 (Figure 2). The ATPT2 
sequence is under control of the 35S promoter. 

The ATPT2 sequence was also cloned in the antisense orientation into the construct 
pCGN8641 to create pCGN10801 (Figure 3). This construct provides for the antisense expression of 
the ATPT2 sequence from the napin promoter. 

The ATPT2 coding sequence was also cloned in the antisense orientation into the vector
pCGN8643 to create the plant transformation construct pCGN10802.

The ATPT2 coding sequence was also cloned in the antisense orientation into the vector
pCGN8644 to create the plant transformation construct pCGN10803 (Figure 4).

The ATPT4 coding sequence was cloned into the vector pCGN864 to create the plant
transformation construct pCGN10806 (Figure 5). The ATPT2 coding sequence was cloned into the
vector pCGN864 to create the plant transformation construct pCGN10807 (Figure 6). The ATPT3
coding sequence was cloned into the vector pCGN864 to create the plant transformation construct
pCGN10808 (Figure 7). The ATPT3 coding sequence was cloned in the sense orientation into the
vector pCGN8640 to create the plant transformation construct pCGN10809 (Figure 8). The ATPT3
coding sequence was cloned in the antisense orientation into the vector pCGN8641 to create the
plant transformation construct pCGN10810 (Figure 9). The ATPT3 coding sequence was cloned into
the vector pCGN8643 to create the plant transformation construct pCGN10811 (Figure 10). The
ATPT3 coding sequence was cloned into the vector pCGN8640 to create the plant transformation
construct pCGN10812 (Figure 11). The ATPT4 coding sequence was cloned into the vector
pCGN8640 to create the plant transformation construct pCGN10813 (Figure 12). The ATPT4 coding
sequence was cloned into the vector pCGN8643 to create the plant transformation construct
pCGN10814 (Figure 13). The ATPT4 coding sequence was cloned into the vector pCGN8641 to
create the plant transformation construct pCGN10815 (Figure 14). The ATPT4 coding sequence was
cloned in the antisense orientation into the vector pCGN8644 to create the plant transformation
construct pCGN10816 (Figure 15). The ATPT2 coding sequence was cloned into the vector
pCGN???? to create the plant transformation construct pCGN10817 (Figure 16). The ATPT8 coding
sequence was cloned in the sense orientation into the vector pCGN8643 to create the plant
transformation construct pCGN10819 (Figure 17). The ATPT12 coding sequence was cloned into
the vector pCGN8644 to create the plant transformation construct pCGN10824 (Figure 18). The
ATPT12 coding sequence was cloned into the vector pCGN8641 to create the plant transformation
construct pCGN10825 (Figure 19). The ATPT8 coding sequence was cloned into the vector
pCGN8644 to create the plant transformation construct pCGN10826 (Figure 20).



Example 3: Plant Transformation 

Transgenic Brassica plants are obtained by Agrobacterium-mediated transformation as
described by Radke et al. (Theor. Appl. Genet. (1988) 75:685-694; Plant Cell Reports (1992)
11:499-505). Transgenic Arabidopsis thaliana plants may be obtained by Agrobacterium-mediated
transformation as described by Valvekens et al. (Proc. Nat. Acad. Sci. (1988)
85:5536-5540), or as described by Bent et al. ((1994), Science 265:1856-1860), or Bechtold et al.
((1993), C.R. Acad. Sci., Life Sciences 316:1194-1199). Other plant species may be similarly
transformed using related techniques.

Alternatively, microprojectile bombardment methods, such as described by Klein et al.
(Bio/Technology 10:286-291) may also be used to obtain nuclear transformed plants.



Example 4: Identification of Additional Prenyltransferases

A PSI-Blast profile generated using the E. coli ubiA (GenBank accession 1790473)
sequence was used to analyze the Synechocystis genome. This analysis identified 5 open reading
frames (ORFs) in the Synechocystis genome that were potentially prenyltransferases: slr0926
(annotated as ubiA (4-hydroxybenzoate-octaprenyl transferase), SEQ ID NO:32), sll1899
(annotated as ctaB (cytochrome c oxidase folding protein), SEQ ID NO:33), slr0056 (annotated as
g4 (chlorophyll synthase 33 kd subunit), SEQ ID NO:34), slr1518 (annotated as menA
(menaquinone biosynthesis protein), SEQ ID NO:35), and slr1736 (annotated as a hypothetical
protein of unknown function, SEQ ID NO:36).

To determine the functionality of these ORFs and their involvement, if any, in the
biosynthesis of tocopherols, knockout constructs were made to disrupt the ORFs identified in
Synechocystis.

Synthetic oligos were designed to amplify regions from the 5' (5'-
TAATGTGTACATTGTCGGCCTC (17365') (SEQ ID NO:61) and 5'-
GCAATGTAACATCAGAGATTTTGAGACACAACGTGGCTTTCCACAATTCCCCGCACCGTC
(1736kanpr1) (SEQ ID NO:62)) and 3' (5'-AGGCTAATAAGCACAAATGGGA (17363')
(SEQ ID NO:63) and 5'-
GGTATGAGTCAGCAACACCTTCTTGACGAGGCAGACCTCAGCGGAATTGGTTTAGGTTATCCC
(1736kanpr2) (SEQ ID NO:64)) ends of the slr1736 ORF.
The 1736kanpr1 and 1736kanpr2 oligos contained 20 bp of homology to the slr1736 ORF with
an additional 40 bp of sequence homology to the ends of the kanamycin resistance cassette.
Separate PCR steps were completed with these oligos and the products were gel purified and
combined with the kanamycin resistance gene from pUC4K (Pharmacia) that had been digested
with HincII and gel purified away from the vector backbone. The combined fragments were
allowed to assemble without oligos under the following conditions: 94°C for 1 min, 55°C for 1
min, 72°C for 1 min plus 5 seconds per cycle for 40 cycles using Pfu polymerase in a 100 ul
reaction volume (Zhao, H and Arnold (1997) Nucleic Acids Res. 25(6):1307-1308). One
microliter or five microliters of this assembly reaction was then amplified using 5' and 3' oligos
nested within the ends of the ORF fragment, so that the resulting product contained 100-200 bp
of the 5' end of the Synechocystis gene to be knocked out, the kanamycin resistance cassette, and
100-200 bp of the 3' end of the gene to be knocked out. This PCR product was then cloned into
the vector pGemT easy (Promega) to create the construct pMON21681 and used for
Synechocystis transformation.
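
The stated primer layout (40 bp matching the kanamycin cassette plus about 20 bp matching the ORF) can be illustrated by comparing two of the cassette-side primers. This is a minimal sketch, not part of the original disclosure: it assumes that the portion shared between primers for different ORFs is the cassette-homology region, and it uses SEQ ID NO:62 together with the corresponding ubiA primer, SEQ ID NO:66, given later in this Example.

```python
import os

# Cassette-side primers for the 5' flank of two different knockouts.
KAN_5PRIME_PRIMERS = {
    "slr1736 (SEQ ID NO:62)":
        "GCAATGTAACATCAGAGATTTTGAGACACAACGTGGCTTTCCACAATTCCCCGCACCGTC",
    "slr0926 (SEQ ID NO:66)":
        "GCAATGTAACATCAGAGATTTTGAGACACAACGTGGCTTTGGGTAAGCAACAATGACCGGC",
}

# os.path.commonprefix compares strings character by character, so it also
# works for finding the leading region shared by DNA sequences.
shared = os.path.commonprefix(list(KAN_5PRIME_PRIMERS.values()))

# Whatever follows the shared region is taken to be ORF-specific.
orf_tails = {name: seq[len(shared):] for name, seq in KAN_5PRIME_PRIMERS.items()}

print(len(shared), shared)
for name, tail in orf_tails.items():
    print(name, len(tail), tail)
```

With these two primers the shared leading region is 40 nucleotides long and the ORF-specific tails are 20 and 21 nucleotides, consistent with the description in the text.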

Primers were also synthesized for the preparation of Synechocystis knockout constructs
for the other sequences using the same method as described above, with the following primers.
The ubiA 5' sequence was amplified using the primers 5'-
GGATCCATGGTTGCCCAAACCCCATC (SEQ ID NO:65) and 5'-
GCAATGTAACATCAGAGATTTTGAGACACAACGTGGCTTTGGGTAAGCAACAATGACCGGC
(SEQ ID NO:66). The 3' region was amplified using the synthetic oligonucleotide primers 5'-
GAATTCTCAAAGCCAGCCCAGTAAC (SEQ ID NO:67) and 5'-
GGTATGAGTCAGCAACACCTTCTTCACGAGGCAGACCTCAGCGGGTGCGAAAAGGGTTTTCCC
(SEQ ID NO:68). The amplification products were combined with the kanamycin resistance gene from
pUC4K (Pharmacia) that had been digested with HincII and gel purified away from the vector
backbone. The annealed fragment was amplified using 5' and 3' oligos nested within the ends of
the ORF fragment (5'-CCAGTGGTTTAGGCTGTGTGGTC (SEQ ID NO:69) and 5'-
CTGAGTTGGATGTATTGGATC (SEQ ID NO:70)), so that the resulting product contained
100-200 bp of the 5' end of the Synechocystis gene to be knocked out, the kanamycin resistance
cassette, and 100-200 bp of the 3' end of the gene to be knocked out. This PCR product was then
cloned into the vector pGemT easy (Promega) to create the construct pMON21682 and used for
Synechocystis transformation.

Primers were also synthesized for the preparation of Synechocystis knockout constructs
for the other sequences using the same method as described above, with the following primers.
The sll1899 5' sequence was amplified using the primers 5'-
GGATCCATGGTTACTTCGACAAAAATCC (SEQ ID NO:71) and 5'-
GCAATGTAACATCAGAGATTTTGAGACACAACGTGGCTTTGCTAGGCAACCGCTTAGTAC
(SEQ ID NO:72). The 3' region was amplified using the synthetic oligonucleotide primers 5'-
GAATTCTTAACCCAACAGTAAAGTTCCC (SEQ ID NO:73) and 5'-
GGTATGAGTCAGCAACACCTTCTTCACGAGGCAGACCTCAGCGCCGGCATTGTCTTTTACATG
(SEQ ID NO:74). The amplification products were combined with the kanamycin resistance gene from
pUC4K (Pharmacia) that had been digested with HincII and gel purified away from the vector
backbone. The annealed fragment was amplified using 5' and 3' oligos nested within the ends of
the ORF fragment (5'-GGAACCCTTGCAGCCGCTTC (SEQ ID NO:75)
and 5'-GTATGCCCAACTGGTGCAGAGG (SEQ ID NO:76)), so that the resulting product
contained 100-200 bp of the 5' end of the Synechocystis gene to be knocked out, the kanamycin
resistance cassette, and 100-200 bp of the 3' end of the gene to be knocked out. This PCR
product was then cloned into the vector pGemT easy (Promega) to create the construct
pMON21679 and used for Synechocystis transformation.

Primers were also synthesized for the preparation of Synechocystis knockout constructs
for the other sequences using the same method as described above, with the following primers.
The slr0056 5' sequence was amplified using the primers 5'-
GGATCCATGTCTGACACACAAAATACCG (SEQ ID NO:77) and 5'-
GCAATGTAACATCAGAGATTTTGAGACACAACGTGGCTTTCGCCAATACCAGCCACCAACAG
(SEQ ID NO:78). The 3' region was amplified using the synthetic oligonucleotide
primers 5'-GAATTCTCAAATCCCCGCATGGCCTAG (SEQ ID NO:79) and 5'-
GGTATGAGTCAGCAACACCTTCTTCACGAGGCAGACCTCAGCGGCCTACGGCTTGGACGTGTGGG
(SEQ ID NO:80). The amplification products were combined with the kanamycin
resistance gene from pUC4K (Pharmacia) that had been digested with HincII and gel purified
away from the vector backbone. The annealed fragment was amplified using 5' and 3' oligos
nested within the ends of the ORF fragment (5'-CACTTGGATTCCCCTGATCTG (SEQ ID
NO:81) and 5'-GCAATACCCGCTTGGAAAACG (SEQ ID NO:82)), so that the resulting
product contained 100-200 bp of the 5' end of the Synechocystis gene to be knocked out, the
kanamycin resistance cassette, and 100-200 bp of the 3' end of the gene to be knocked out. This
PCR product was then cloned into the vector pGemT easy (Promega) to create the construct
pMON21677 and used for Synechocystis transformation.

Primers were also synthesized for the preparation of Synechocystis knockout constructs
for the other sequences using the same method as described above, with the following primers.
The slr1518 5' sequence was amplified using the primers 5'-
GGATCCATGACCGAATCTTCGCCCCTAGC (SEQ ID NO:83) and 5'-
GCAATGTAACATCAGAGATTTTGAGACACAACGTGGCTTTCAATCCTAGGTAGCCGAGGCG
(SEQ ID NO:84). The 3' region was amplified using the synthetic oligonucleotide primers 5'-
GAATTCTTAGCCCAGGCCAGCCCAGCC (SEQ ID NO:85) and 5'-
GGTATGAGTCAGCAACACCTTCTTCACGAGGCAGACCTCAGCGGGGAATTGATTTGTTTAATTACC
(SEQ ID NO:86). The amplification products were combined with the kanamycin resistance gene
from pUC4K (Pharmacia) that had been digested with HincII and gel purified away from the vector
backbone. The annealed fragment was amplified using 5' and 3' oligos nested within the ends of
the ORF fragment (5'-GCGATCGCCATTATCGCTTGG (SEQ ID NO:87) and 5'-
GCAGACTGGCAATTATCAGTAACG (SEQ ID NO:88)), so that the resulting product
contained 100-200 bp of the 5' end of the Synechocystis gene to be knocked out, the kanamycin
resistance cassette, and 100-200 bp of the 3' end of the gene to be knocked out. This PCR
product was then cloned into the vector pGemT easy (Promega) to create the construct
pMON21680 and used for Synechocystis transformation.

B. Transformation of Synechocystis

Cells of Synechocystis 6803 were grown to a density of approximately 2x10^8 cells per ml
and harvested by centrifugation. The cell pellet was re-suspended in fresh BG-11 medium
(ATCC Medium 616) at a density of 1x10^9 cells per ml and used immediately for transformation.
One-hundred microliters of these cells were mixed with 5 ul of mini prep DNA and incubated
with light at 30°C for 4 hours. This mixture was then plated onto nylon filters resting on BG-11
agar supplemented with TES pH 8 and allowed to grow for 12-18 hours. The filters were then
transferred to BG-11 agar + TES + 5 ug/ml kanamycin and allowed to grow until colonies
appeared within 7-10 days (Packer and Glazer, 1988). Colonies were then picked into BG-11
liquid media containing 5 ug/ml kanamycin and allowed to grow for 5 days. These cells were
then transferred to BG-11 media containing 10 ug/ml kanamycin and allowed to grow for 5 days
and then transferred to BG-11 + kanamycin at 25 ug/ml and allowed to grow for 5 days. Cells
were then harvested for PCR analysis to determine the presence of a disrupted ORF and also for
HPLC analysis to determine if the disruption had any effect on tocopherol levels.

PCR analysis of the Synechocystis isolates for slr1736 and sll1899 showed complete
segregation of the mutant genome, meaning no copies of the wild type genome could be detected
in these strains. This suggests that function of the native gene is not essential for cell function.
HPLC analysis of these same isolates showed that the sll1899 strain had no detectable reduction
in tocopherol levels. However, the strain carrying the knockout for slr1736 produced no
detectable levels of tocopherol.
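
The resuspension step in the transformation protocol above is a simple concentration calculation, sketched below. This is an illustration only: the 50 ml culture volume is a hypothetical value, the starting density of 2x10^8 cells per ml is taken from the parallel protocol in section 4B, and cell loss during centrifugation is ignored.

```python
def resuspension_volume_ml(culture_ml, culture_density, target_density):
    """Volume of fresh medium needed to reach target_density (cells per ml)."""
    return culture_ml * culture_density / target_density


CULTURE_ML = 50.0          # hypothetical culture volume
GROWTH_DENSITY = 2e8       # cells per ml at harvest
TRANSFORM_DENSITY = 1e9    # cells per ml after resuspension
ALIQUOT_ML = 0.1           # 100 ul of cells per transformation

volume_ml = resuspension_volume_ml(CULTURE_ML, GROWTH_DENSITY, TRANSFORM_DENSITY)
cells_per_transformation = TRANSFORM_DENSITY * ALIQUOT_ML
transformations_possible = int(volume_ml / ALIQUOT_ML)

print(volume_ml, cells_per_transformation, transformations_possible)
```

Under these assumptions the pellet is concentrated five-fold, and each 100 ul transformation contains about 1x10^8 cells.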

The amino acid sequences for the Synechocystis knockouts are compared using ClustalW, 
and are provided in Table 3 below. Provided are the percent identities, percent similarity, and the 
percent gap. The alignment of the sequences is provided in Figure 21. 



Table 3:

                      slr1736  slr0926  sll1899  slr0056  slr1518
slr1736  %identity                  14       12       18       11
         %similar                   29       30       34       26
         %gap                        8        7       10        5
slr0926  %identity                           20       19       14
         %similar                            39       32       28
         %gap                                 7        9        4
sll1899  %identity                                    17       13
         %similar                                     29       29
         %gap                                         12        9
slr0056  %identity                                             15
         %similar                                              31
         %gap                                                   8
slr1518  %identity
         %similar
         %gap

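
For readers unfamiliar with the three figures reported per sequence pair, the sketch below computes percent identity, percent similarity and percent gap from one pairwise alignment. It is an illustration only: the text does not state how the ClustalW-derived percentages were calculated, so the denominator (alignment length), the residue similarity groups, and the two example sequences are all assumptions.

```python
# Hypothetical grouping of residues treated as "similar"; not from the source.
SIMILAR_GROUPS = ("ILVM", "FWY", "KRH", "DE", "ST", "NQ", "AG")


def is_similar(x, y):
    """True when two residues are identical or fall in the same group."""
    return x == y or any(x in g and y in g for g in SIMILAR_GROUPS)


def alignment_stats(a, b):
    """Percent identity, similarity and gap over the alignment length."""
    if len(a) != len(b) or not a:
        raise ValueError("aligned sequences must be non-empty and equal length")
    identical = similar = gapped = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            gapped += 1          # column with a gap in either sequence
        elif x == y:
            identical += 1
            similar += 1         # identical residues also count as similar
        elif is_similar(x, y):
            similar += 1
    n = len(a)
    return {
        "identity": 100 * identical / n,
        "similar": 100 * similar / n,
        "gap": 100 * gapped / n,
    }


# Toy alignment, eight columns, two of them gapped.
stats = alignment_stats("MKT-LLVA", "MRTALI-A")
print(stats)
```

A different choice of denominator (for example the shorter sequence length) or of similarity matrix would change the numbers, so values from this sketch are not expected to reproduce the tables.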

Amino acid sequence comparisons are performed using various Arabidopsis
prenyltransferase sequences and the Synechocystis sequences. The comparisons are presented in
Table 4 below. Provided are the percent identities, percent similarity, and the percent gap. The
alignment of the sequences is provided in Figure 22.



Table 4:

                        ATPT2  slr1736    ATPT3  slr0926    ATPT4  sll1899   ATPT12  slr0056    ATPT8  slr1518
ATPT2    %identity                  29        9        9        8        8       12        9        7        9
         %similar                   46       23       21       20       20       28       23       21       20
         %gap                       27       13       28       23       29       11       24       25       24
slr1736  %identity                            9       13        8       12       13       15        8       10
         %similar                            19       28       19       28       26       33       21       26
         %gap                                34       12       34       15       26       10       12       10
ATPT3    %identity                                    23       11       14       13       10        5       11
         %similar                                     36       26       26       26       21       14       22
         %gap                                         29       21       31       16       30       30       30
slr0926  %identity                                             12       20       17       20       11       14
         %similar                                              24       37       28       33       24       29
         %gap                                                  33       12       25       10       11        9
ATPT4    %identity                                                      18       11        8        6        7
         %similar                                                       33       23       18       16       19
         %gap                                                           28       19       32       32       33
sll1899  %identity                                                               13       17       10       12
         %similar                                                                24       30       23       26
         %gap                                                                    27       13       10       11
ATPT12   %identity                                                                        52        8       11
         %similar                                                                         66       19       26
         %gap                                                                             18       25       23
slr0056  %identity                                                                                  9       13
         %similar                                                                                  23       32
         %gap                                                                                      10        8
ATPT8    %identity                                                                                           7
         %similar                                                                                           23
         %gap                                                                                                7
slr1518  %identity
         %similar
         %gap



4B. Preparation of the slr1737 Knockout

The Synechocystis sp. 6803 slr1737 knockout was constructed by the following method.
The GPS™-1 Genome Priming System (New England Biolabs) was used to insert, by a Tn7
Transposase system, a Kanamycin resistance cassette into slr1737. A plasmid from a
Synechocystis genomic library clone containing 652 base pairs of the targeted orf (Synechocystis
genome base pairs 1324051 - 1324703; the predicted orf base pairs 1323672 - 1324763, as
annotated by Cyanobase) was used as target DNA. The reaction was performed according to the
manufacturer's protocol. The reaction mixture was then transformed into E. coli DH10B
electrocompetent cells and plated. Colonies from this transformation were then screened for
transposon insertions into the target sequence by amplifying with M13 Forward and Reverse
Universal primers, yielding a product of 652 base pairs plus ~1700 base pairs, the size of the
transposon kanamycin cassette, for a total fragment size of ~2300 base pairs. After this
determination, it was then necessary to determine the approximate location of the insertion
within the targeted orf, as 100 base pairs of orf sequence was estimated as necessary for efficient
homologous recombination in Synechocystis. This was accomplished through amplification
reactions using either of the primers to the ends of the transposon, Primer S (5' end) or N (3'
end), in combination with either a M13 Forward or Reverse primer. That is, four different primer
combinations were used to map each potential knockout construct: Primer S - M13 Forward,
Primer S - M13 Reverse, Primer N - M13 Forward, Primer N - M13 Reverse. The construct
used to transform Synechocystis and knock out slr1737 was determined to consist of
approximately 150 base pairs of slr1737 sequence on the 5' side of the transposon insertion and
approximately 500 base pairs on the 3' side, with the transcription of the orf and kanamycin
cassette in the same direction. The nucleic acid sequence of slr1737 is provided in SEQ ID
NO:38; the deduced amino acid sequence is provided in SEQ ID NO:39.
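
The size and position checks described above can be laid out as a short calculation. This is a minimal sketch using only the coordinates and approximate sizes quoted in the text; the variable names are hypothetical, and the transposon size and flank lengths are the approximate values given above.

```python
# Genome coordinates quoted in the text (base pairs).
CLONE_START, CLONE_END = 1324051, 1324703   # cloned portion of the orf
ORF_START, ORF_END = 1323672, 1324763       # predicted orf (Cyanobase)

TRANSPOSON_BP = 1700        # approximate size of the kanamycin transposon
MIN_HOMOLOGY_BP = 100       # orf sequence estimated as needed on each side

target_bp = CLONE_END - CLONE_START
expected_pcr_bp = target_bp + TRANSPOSON_BP
clone_inside_orf = ORF_START <= CLONE_START and CLONE_END <= ORF_END

# Flanks reported for the construct actually used (approximate).
flank_5p_bp, flank_3p_bp = 150, 500
flanks_sufficient = min(flank_5p_bp, flank_3p_bp) >= MIN_HOMOLOGY_BP

print(target_bp, expected_pcr_bp, clone_inside_orf, flanks_sufficient)
```

The 652 bp target plus the transposon gives an expected screening product of about 2350 bp, in line with the ~2300 bp quoted, and both reported flanks exceed the 100 bp estimated as necessary for homologous recombination.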

Cells of Synechocystis 6803 were grown to a density of ~2x10^8 cells per ml and
harvested by centrifugation. The cell pellet was re-suspended in fresh BG-11 medium at a
density of 1x10^9 cells per ml and used immediately for transformation. 100 ul of these cells were
mixed with 5 ul of mini prep DNA and incubated with light at 30°C for 4 hours. This mixture
was then plated onto nylon filters resting on BG-11 agar supplemented with TES pH 8 and
allowed to grow for 12-18 hours. The filters were then transferred to BG-11 agar + TES +
5 ug/ml kanamycin and allowed to grow until colonies appeared within 7-10 days (Packer and
Glazer, 1988). Colonies were then picked into BG-11 liquid media containing 5 ug/ml
kanamycin and allowed to grow for 5 days. These cells were then transferred to BG-11 media
containing 10 ug/ml kanamycin and allowed to grow for 5 days and then transferred to BG-11 +
kanamycin at 25 ug/ml and allowed to grow for 5 days. Cells were then harvested for PCR
analysis to determine the presence of a disrupted ORF and also for HPLC analysis to determine if
the disruption had any effect on tocopherol levels.

PCR analysis of the Synechocystis isolates, using primers to the ends of the slr1737 orf,
showed complete segregation of the mutant genome, meaning no copies of the wild type genome
could be detected in these strains. This suggests that function of the native gene is not essential
for cell function. HPLC analysis showed that the strain carrying the knockout for slr1737 produced no
detectable levels of tocopherol.

4C. Phytyl Prenyltransferase Enzyme Assays 

[3H]Homogentisic acid in 0.1% H3PO4 (specific radioactivity 40 Ci/mmol). Phytyl
pyrophosphate was synthesized as described by Joo, et al. (1973) Can J. Biochem. 51:1527.
2-methyl-6-phytylquinol and 2,3-dimethyl-5-phytylquinol were synthesized as described by Soll, et
al. (1980) Phytochemistry 19:215. Homogentisic acid, α, β, δ, and γ-tocopherol, and tocol were
purchased commercially.

The wild-type strain of Synechocystis sp. PCC 6803 was grown in BG11 medium with
bubbling air at 30°C under 50 uE m^-2 s^-1 fluorescent light, and 70% relative humidity. The growth
medium of the slr1736 knock-out (potential PPT) strain of this organism was supplemented with 25
ug mL^-1 kanamycin. Cells were collected from 0.25 to 1 liter culture by centrifugation at 5000 g
for 10 min and stored at -80°C.

Total membranes were isolated according to Zak's procedures with some modifications (Zak,
et al. (1999) Eur J. Biochem 261:311). Cells were broken on a French press. Before the French
press treatment, the cells were incubated for 1 hour with lysozyme (0.5%, w/v) at 30°C in a
medium containing 7 mM EDTA, 5 mM NaCl and 10 mM Hepes-NaOH, pH 7.4. The
spheroplasts were collected by centrifugation at 5000 g for 10 min and resuspended at 0.1 - 0.5 mg
chlorophyll mL^-1 in 20 mM potassium phosphate buffer, pH 7.8. A proper amount of protease
inhibitor cocktail and DNAase I from Boehringer Mannheim was added to the solution. French
press treatments were performed two to three times at 100 MPa. After breakage, the cell
suspension was centrifuged for 10 min at 5000 g to pellet unbroken cells, and this was followed by
centrifugation at 100 000 g for 1 hour to collect total membranes. The final pellet was resuspended
in a buffer containing 50 mM Tris-HCl and 4 mM MgCl2.

Chloroplast pellets were isolated from 250 g of spinach leaves obtained from local markets.
Deveined leaf sections were cut into grinding buffer (2 L/250 g leaves) containing 2 mM EDTA, 1
mM MgCl2, 1 mM MnCl2, 0.33 M sorbitol, 0.1% ascorbic acid, and 50 mM Hepes at pH 7.5. The
leaves were homogenized for 3 sec three times in a 1-L blender, and filtered through 4 layers of
Miracloth. The supernatant was then centrifuged at 5000 g for 6 min. The chloroplast pellets were
resuspended in a small amount of grinding buffer (Douce, et al. Methods in Chloroplast Molecular
Biology, 239 (1982)).

Chloroplasts in pellets can be broken in three ways. Chloroplast pellets were first aliquoted
in 1 mg of chlorophyll per tube, centrifuged at 6000 rpm for 2 min in a microcentrifuge, and
grinding buffer was removed. Two hundred microliters of Triton X-100 buffer (0.1% Triton X-100,
50 mM Tris-HCl pH 7.6 and 4 mM MgCl2) or swelling buffer (10 mM Tris pH 7.6 and 4
mM MgCl2) was added to each tube and incubated for 1/2 hour at 4°C. Then the broken
chloroplast pellets were used for the assay immediately. In addition, broken chloroplasts can also
be obtained by freezing in liquid nitrogen and storing at -80°C for 1/2 hour, then used for the assay.
In some cases chloroplast pellets were further purified with a 40%/80% percoll gradient to
obtain intact chloroplasts. The intact chloroplasts were broken with swelling buffer, then either
used for assay or further purified for envelope membranes with a 20.5%/31.8% sucrose density
gradient (Soll, et al. (1980) supra). The membrane fractions were centrifuged at 100 000 g for 40
min and resuspended in 50 mM Tris-HCl pH 7.6, 4 mM MgCl2.

Various amounts of [3H]HGA, 40 to 60 µM unlabelled HGA with specific activity in the
range of 0.16 to 4 Ci/mmole were mixed with a proper amount of 1 M Tris-NaOH pH 10 to adjust
pH to 7.6. HGA was reduced for 4 min with a trace amount of solid NaBH4. In addition to HGA,
the standard incubation mixture (final vol 1 mL) contained 50 mM Tris-HCl, pH 7.6, 3-5 mM MgCl2,
and 100 µM phytyl pyrophosphate. The reaction was initiated by addition of Synechocystis total
membranes, spinach chloroplast pellets, spinach broken chloroplasts, or spinach envelope
membranes. The enzyme reaction was carried out for 2 hours at 23°C or 30°C in the dark or light.
The reaction was stopped by freezing with liquid nitrogen and stored at -80°C, or directly by
extraction.
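
The dilution of labelled substrate with unlabelled HGA can be sketched as a simple isotope-dilution calculation. This is an illustration under stated assumptions, not the authors' procedure: it assumes the 40 Ci/mmol stock is mixed with carrier-free unlabelled HGA, that 50 µM unlabelled HGA in the 1 mL assay corresponds to 50 nmol, and the function names are hypothetical.

```python
STOCK_SA = 40.0     # Ci/mmol, specific radioactivity of the [3H]HGA stock
COLD_NMOL = 50.0    # nmol unlabelled HGA (50 uM in a 1 mL assay, assumed)


def labelled_nmol_needed(cold_nmol, stock_sa, target_sa):
    """nmol of labelled HGA to add so the mixture reaches target_sa."""
    if not 0 < target_sa < stock_sa:
        raise ValueError("target must lie between 0 and the stock activity")
    return cold_nmol * target_sa / (stock_sa - target_sa)


def mixed_specific_activity(hot_nmol, cold_nmol, stock_sa):
    """Specific activity (Ci/mmol) after mixing labelled and unlabelled HGA."""
    return stock_sa * hot_nmol / (hot_nmol + cold_nmol)


hot_high = labelled_nmol_needed(COLD_NMOL, STOCK_SA, 4.0)    # upper end of range
hot_low = labelled_nmol_needed(COLD_NMOL, STOCK_SA, 0.16)    # lower end of range
print(round(hot_high, 3), round(hot_low, 3))
```

Under these assumptions, roughly 5.6 nmol of labelled HGA per 50 nmol of unlabelled HGA gives 4 Ci/mmol, and about 0.2 nmol gives 0.16 Ci/mmol.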

A constant amount of tocol was added to each assay mixture and reaction products were
extracted with a 2 mL mixture of chloroform/methanol (1:2, v/v) to give a monophasic solution.
NaCl solution (2 mL; 0.9%) was added with vigorous shaking. This extraction procedure was
repeated three times. The organic layer containing the prenylquinones was filtered through a 20
mµ filter, evaporated under N2, and then resuspended in 100 µL of ethanol.

The samples were mainly analyzed by a Normal-Phase HPLC method (isocratic 90% hexane
and 10% methyl-t-butyl ether) using a Zorbax silica column, 4.6 x 250 mm. The samples were
also analyzed by a Reversed-Phase HPLC method (isocratic 0.1% H3PO4 in MeOH) using a
Vydac 201HS54 C18 column, 4.6 x 250 mm, coupled with an Alltech C18 guard column. The
amounts of products were calculated based on the substrate specific radioactivity, and adjusted
according to the % recovery based on the amount of internal standard.
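
The quantification described above can be sketched as follows. This is a minimal illustration, not the authors' calculation: it assumes radioactivity in the product peak is available in disintegrations per minute (already corrected for counting efficiency), uses the standard conversion of 2.22e12 dpm per curie, and takes the recovery fraction from the tocol internal standard. The example numbers are hypothetical.

```python
DPM_PER_CI = 2.22e12   # disintegrations per minute in one curie


def product_pmol(dpm, specific_activity_ci_per_mmol, recovery_fraction):
    """Convert peak radioactivity to pmol product, corrected for recovery."""
    if not 0 < recovery_fraction <= 1:
        raise ValueError("recovery_fraction must be in (0, 1]")
    mmol = dpm / (DPM_PER_CI * specific_activity_ci_per_mmol)
    pmol = mmol * 1e9                      # 1 mmol = 1e9 pmol
    return pmol / recovery_fraction


# Hypothetical example: 8880 dpm in the product peak, substrate at
# 4 Ci/mmol, 80% of the internal standard recovered.
amount = product_pmol(8880.0, 4.0, 0.8)
print(round(amount, 3))
```

In this hypothetical case the peak corresponds to 1 pmol before correction and 1.25 pmol after adjusting for 80% recovery.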

The amount of chlorophyll was determined as described in Arnon (1949) Plant Physiol 24:1.
The amount of protein was determined by the Bradford method using gamma globulin as a standard
(Bradford, (1976) Anal Biochem. 72:248).

Results of the assay demonstrate that 2-Methyl-6-Phytylplastoquinone is produced in the 
Synechocystis slrl736 knockout preparations. The results of the phytyl prenyltransferase enzyme 
activity assay for the sir 1736 knock out are presented in Figure 23. 
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4D. Complementation of the slrl736 knockout with ATPT2 

In order to determine whether ATPT2 could complement the knockout of slrl736 in 
Synechocystis 6803 a plasmid was constructed to express the ATPT2 sequence from the TAC 
promoter. A vector, plasmid psll21 1, was obtained from the lab of Dr. Himadri Pakrasi of 
5 Washington University, and is based on the plasmid RSF1010 which is a broad host range 

plasmid (Ng W.-O., Zentella R., Wang, Y., Taylor J-S. A., Pakrasi, H.B. 2000. phrA, the major 
photoreactivating factor in the cyanobacterium Synechocystis sp. strain PCC 6803 codes for a 
cyclobutane pyrimidine dimer specific DNA photolyase. Arch. Microbiol, (in press)). The 
ATPT2 gene was isolated from the vector pCGN10817 by PCR using the following primers. 
ATPT2nco.pr 5'-CCATGGATTCGAGTAAAGTTGTCGC (SEQ ID NO:89); ATPT2ri.pr
5'-GAATTCACTTCAAAAAAGGTAACAG (SEQ ID NO:90). These primers will remove
approximately 112 bp from the 5' end of the ATPT2 sequence, which is thought to encode the
chloroplast transit peptide. These primers will also add an NcoI site at the 5' end and an EcoRI
site at the 3' end, which can be used for sub-cloning into subsequent vectors. The PCR product
obtained using these primers and pCGN10817 was ligated into pGEM T easy, and the resulting
vector, pMON21689, was confirmed by sequencing using the M13 forward and M13 reverse
primers. The NcoI/EcoRI fragment from pMON21689 was then ligated with the EagI/EcoRI
and EagI/NcoI fragments from psl1211, resulting in pMON21690. The plasmid pMON21690
was introduced into the slr1736 Synechocystis 6803 KO strain via conjugation. Cells of sl906 (a
helper strain) and DH10B cells containing pMON21690 were grown to log phase (O.D. 600 =
0.4) and 1 ml was harvested by centrifugation. The cell pellets were washed twice with a sterile
BG-11 solution and resuspended in 200 ul of BG-11. The following was mixed in a sterile
eppendorf tube: 50 ul sl906, 50 ul DH10B cells containing pMON21690, and 100 ul of a fresh
culture of the slr1736 Synechocystis 6803 KO strain (O.D. 730 = 0.2-0.4). The cell mixture was
immediately transferred to a nitrocellulose filter resting on BG-11 and incubated for 24 hours at
30°C and 2500 lux (50 uE) of light. The filter was then transferred to BG-11 supplemented with
10 ug/ml Gentamycin and incubated as above for ~5 days. When colonies appeared, they were
picked and grown up in liquid BG-11 + Gentamycin 10 ug/ml (Elhai, J. and Wolk, P. 1988.
Conjugal transfer of DNA to Cyanobacteria. Methods in Enzymology 167, 747-54). The liquid
cultures were then assayed for tocopherols by harvesting 1 ml of culture by centrifugation,
extracting with ethanol/pyrogallol, and separating by HPLC. The slr1736 Synechocystis 6803 KO
strain did not contain any detectable tocopherols, while the slr1736 Synechocystis 6803 KO
strain transformed with pMON21690 contained detectable alpha tocopherol. A Synechocystis
6803 strain transformed with psl1211 (vector control) produced alpha tocopherol as well.
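
The primer design described above can be checked with a short script. The following is only a minimal sketch, not part of the original procedure: it confirms that each primer begins with the stated restriction site and measures how much of each primer's 3' end matches two stretches copied from SEQ ID NO:1. The helper names and the choice of template stretches (nucleotides 101-150 and the last 42 nucleotides) are assumptions made for illustration.

```python
# Sketch: sanity-check the ATPT2 primers against fragments of SEQ ID NO:1.
# Helper names and the fragment boundaries are illustrative assumptions.

NCOI_SITE = "CCATGG"    # NcoI recognition sequence
ECORI_SITE = "GAATTC"   # EcoRI recognition sequence

FORWARD_PRIMER = "CCATGGATTCGAGTAAAGTTGTCGC"  # ATPT2nco.pr (SEQ ID NO:89)
REVERSE_PRIMER = "GAATTCACTTCAAAAAAGGTAACAG"  # ATPT2ri.pr (SEQ ID NO:90)

# Nucleotides 101-150 of SEQ ID NO:1, copied group by group from the listing.
ATPT2_NT_101_150 = ("ttctgcgttg" "tgattcgagt" "aaagttgtcg"
                    "caaaaccgaa" "gtttaggaac").upper()
# Last 42 nucleotides (1141-1182) of SEQ ID NO:1, including the stop codon.
ATPT2_LAST_42_NT = ("ctcttttatg" "cagagtactt" "gctgttacct"
                    "tttttgaagt" "ga").upper()


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def longest_3prime_match(primer, template):
    """Length of the longest 3'-terminal stretch of primer found in template."""
    primer = primer.upper()
    for length in range(len(primer), 0, -1):
        if primer[-length:] in template:
            return length
    return 0


forward_match = longest_3prime_match(FORWARD_PRIMER, ATPT2_NT_101_150)
reverse_match = longest_3prime_match(
    REVERSE_PRIMER, reverse_complement(ATPT2_LAST_42_NT))

# 1-based position in SEQ ID NO:1 where the forward primer starts to anneal;
# everything upstream of it is left out of the PCR product.
anneal_start = 101 + ATPT2_NT_101_150.find(FORWARD_PRIMER[-forward_match:])

print("forward primer starts with NcoI site:", FORWARD_PRIMER.startswith(NCOI_SITE))
print("reverse primer starts with EcoRI site:", REVERSE_PRIMER.startswith(ECORI_SITE))
print("forward 3' match length:", forward_match)
print("reverse 3' match length:", reverse_match)
print("forward primer anneals from nucleotide:", anneal_start)
```

The annealing start position reported by the script can be compared with the approximately 112 bp that the text says is removed from the 5' end.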


Example 5: Transgenic Plant Analysis 

Arabidopsis plants transformed with constructs for the sense or antisense expression of
the ATPT proteins were analyzed by High Pressure Liquid Chromatography (HPLC) for altered
levels of total tocopherols, as well as altered levels of specific tocopherols (alpha, beta, gamma,
and delta tocopherol).

Extracts of leaves and seeds were prepared for HPLC as follows. For seed extracts, 10
mg of seed was added to 1 g of microbeads (Biospec) in a sterile microfuge tube to which 500 ul
1% pyrogallol (Sigma Chem)/ethanol was added. The mixture was shaken for 3 minutes in a
mini Beadbeater (Biospec) on "fast" speed. The extract was filtered through a 0.2 um filter into
an autosampler tube. The filtered extracts were then used in the HPLC analysis described below.

Leaf extracts were prepared by mixing 30-50 mg of leaf tissue with 1 g microbeads and
freezing in liquid nitrogen until extraction. For extraction, 500 ul 1% pyrogallol in ethanol was
added to the leaf/bead mixture and shaken for 1 minute on a Beadbeater (Biospec) on "fast"
speed. The resulting mixture was centrifuged for 4 minutes at 14,000 rpm and filtered as
described above prior to HPLC analysis.

HPLC was performed on a Zorbax silica HPLC column (4.6 mm x 250 mm) with
fluorescence detection, an excitation at 290 nm, an emission at 336 nm, and bandpass and slits.
Solvent A was hexane and solvent B was methyl-t-butyl ether. The injection volume was 20 ul,
the flow rate was 1.5 ml/min, and the run time was 12 min (40°C) using the gradient (Table 5):



Table 5:

Time        Solvent A    Solvent B
0 min.      90%          10%
10 min.     90%          10%
11 min.     25%          75%
12 min.     90%          10%
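
The gradient of Table 5 can be expressed as a small lookup for the solvent composition at any time in the run. This is a minimal sketch that assumes the pump ramps linearly between the listed time points; the text does not state the ramp shape, and the function name is hypothetical.

```python
# Sketch: solvent composition during the 12 min run of Table 5.
# Assumption (not stated in the text): linear ramps between the time points.

# (time in minutes, percent solvent A); solvent B is the balance to 100%.
GRADIENT = [(0.0, 90.0), (10.0, 90.0), (11.0, 25.0), (12.0, 90.0)]


def solvent_composition(t_min, gradient=GRADIENT):
    """Return (percent A, percent B) at time t_min by linear interpolation."""
    if not gradient[0][0] <= t_min <= gradient[-1][0]:
        raise ValueError("time is outside the programmed run")
    for (t0, a0), (t1, a1) in zip(gradient, gradient[1:]):
        if t0 <= t_min <= t1:
            fraction = (t_min - t0) / (t1 - t0)
            percent_a = a0 + fraction * (a1 - a0)
            return percent_a, 100.0 - percent_a


for minute in (0.0, 5.0, 10.5, 11.0, 12.0):
    percent_a, percent_b = solvent_composition(minute)
    print(f"{minute:5.1f} min: {percent_a:5.1f}% A, {percent_b:5.1f}% B")
```

Under the linear-ramp assumption, the run is isocratic for the first 10 minutes, steps to a high methyl-t-butyl ether wash at 11 minutes, and returns to the starting composition by 12 minutes.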
Tocopherol standards in 1% pyrogallol/ethanol were also run for comparison (alpha
tocopherol, gamma tocopherol, beta tocopherol, delta tocopherol, and tocopherol (tocol), all from
Matreya).

Standard curves for alpha, beta, delta, and gamma tocopherol were calculated using
Chemstation software. The absolute amount of component x is: Absolute amount of x =
Response_x × RF_x × dilution factor, where Response_x is the area of peak x, RF_x is the response
factor for component x (Amount_x/Response_x), and the dilution factor is 500 ul. The ng/mg tissue
is found by: total ng component/mg plant tissue.

Results of the HPLC analysis of seed extracts of transgenic Arabidopsis lines containing
pMON10822 for the expression of ATAT2 from the napin promoter are provided in Figure 24.

HPLC analysis of Arabidopsis seed tissue expressing the ATAT2 sequence from
the napin promoter (pMON10822) demonstrates an increased level of tocopherols in the seed.
Total tocopherol levels are increased as much as 50 to 60% over the total tocopherol levels of
non-transformed (wild-type) Arabidopsis plants (Figure 24).

Furthermore, levels of particular tocopherols are also increased in transgenic
Arabidopsis plants expressing the ATAT2 nucleic acid sequence from the napin promoter.
Levels of delta tocopherol in these lines are increased greater than 3 fold over the delta
tocopherol levels obtained from the seeds of wild type Arabidopsis lines. Levels of gamma
tocopherol in transgenic Arabidopsis lines expressing the ATAT2 nucleic acid sequence are
increased as much as about 60% over the levels obtained in the seeds of non-transgenic control
lines. Furthermore, levels of alpha tocopherol are increased as much as 3 fold over those
obtained from non-transgenic control lines.

Results of the HPLC analysis of seed extracts of transgenic Arabidopsis lines containing
pMON10803 for the expression of ATAT2 from the enhanced 35S promoter are provided in
Figure 25.

All publications and patent applications mentioned in this specification are indicative of
the level of skill of those skilled in the art to which this invention pertains. All publications and
patent applications are herein incorporated by reference to the same extent as if each individual
publication or patent application was specifically and individually indicated to be incorporated by
reference.

Attorney Docket No: 17133/02/US

Although the foregoing invention has been described in some detail by way of illustration 
and example for purposes of clarity of understanding, it will be obvious that certain changes and 
modifications may be practiced within the scope of the appended claims.
Claims

What is Claimed is: 

1. An isolated nucleic acid sequence encoding a prenyltransferase. 

2. An isolated nucleic acid sequence according to Claim 1, wherein said prenyltransferase is 
selected from the group consisting of straight chain prenyltransferase and aromatic prenyltransferase. 

3. An isolated DNA sequence according to Claim 1, wherein said nucleic acid sequence is 
isolated from a eukaryotic cell source. 

4. An isolated DNA sequence according to Claim 3, wherein said eukaryotic cell source is 
selected from the group consisting of mammalian, nematode, fungal, and plant cells. 

5. The DNA encoding sequence of Claim 4 wherein said prenyltransferase protein is from 
Arabidopsis. 

6. The DNA encoding sequence of Claim 5 wherein said prenyltransferase protein is encoded by 
a sequence selected from the group consisting of the sequences of Figure 1. 

7. The DNA encoding sequence of Claim 4 wherein said prenyltransferase protein is from corn.

8. The DNA encoding sequence of Claim 7 wherein said prenyltransferase protein is encoded by 
a sequence which includes the EST of the sequences of Figure 3. 

9. The DNA encoding sequence of Claim 4 wherein said prenyltransferase protein is from 
soybean. 

10. The DNA encoding sequence of Claim 9 wherein said prenyltransferase protein is encoded 
by a sequence which includes the ESTs of the group consisting of the sequences of Figure 2 and Figure 
9. 

11. An isolated DNA sequence according to Claim 1, wherein said nucleic acid sequence is
isolated from a prokaryotic cell source.

12. An isolated DNA sequence according to Claim 11, wherein said prokaryotic source is
Synechocystis.

13. A nucleic acid construct comprising as operably linked components, a transcriptional 
initiation region functional in a host cell, a nucleic acid sequence encoding prenyltransferase, and a 
transcriptional termination region. 
14. A nucleic acid construct according to Claim 13, wherein said nucleic acid sequence encoding
prenyltransferase is obtained from an organism selected from the group consisting of a eukaryotic
organism and a prokaryotic organism.

15. A nucleic acid construct according to Claim 14, wherein said nucleic acid sequence encoding
prenyltransferase is obtained from a plant source.

16. A nucleic acid construct according to Claim 15, wherein said nucleic acid sequence encoding 
prenyltransferase is obtained from a source selected from the group consisting of Arabidopsis, soybean 
and corn. 

17. A nucleic acid construct according to Claim 13, wherein said nucleic acid sequence encoding
prenyltransferase is obtained from Synechocystis.

18. A plant cell comprising the construct of Claim 13.

19. A method for the alteration of the tocopherol content in a host cell, comprising; transforming
said host cell with a construct comprising as operably linked components, a transcriptional initiation
region functional in a host cell, a nucleic acid sequence encoding prenyltransferase, and a transcriptional
termination region.

20. The method according to Claim 19, wherein said host cell is selected from the group
consisting of a prokaryotic cell and a eukaryotic cell.

21. The method according to Claim 20, wherein said prokaryotic cell is Synechocystis.

22. The method according to Claim 20, wherein said eukaryotic cell is a plant cell.

23. The method according to Claim 22, wherein said plant cell is obtained from a plant selected
from the group consisting of Arabidopsis, soybean, and corn.

24. A method for producing a tocopherol compound of interest in a host cell, said method
comprising obtaining a transformed host cell, said host cell having and expressing in its genome:

a construct having a DNA sequence encoding a prenyltransferase operably linked to a
transcriptional initiation region functional in a host cell,

wherein said prenyltransferase is involved in the synthesis of tocopherols.

25. The method according to Claim 24, wherein said host cell is selected from the group 
consisting of a prokaryotic cell and a eukaryotic cell. 

26. The method according to Claim 25, wherein said prokaryotic cell is Synechocystis. 

27. The method according to Claim 24, wherein said eukaryotic cell is a plant cell.

28. The method according to Claim 27, wherein said plant cell is obtained from a plant selected
from the group consisting of Arabidopsis, soybean, and corn. 

29. A method for increasing the biosynthetic flux in cell from a host cell toward 
tocopherol production, said method comprising transforming said host cell with a construct 
comprising as operably linked components, a transcriptional initiation region functional in a host 
cell, a DNA encoding a prenyltransferase involved in the synthesis of tocopherols, and a 
transcriptional termination region. 

30. The method according to Claim 29, wherein said host cell is selected from the group 
consisting of a prokaryotic cell and a eukaryotic cell. 

31. The method according to Claim 30, wherein said prokaryotic cell is Synechocystis. 

32. The method according to Claim 30, wherein said eukaryotic cell is a plant cell. 

33. The method according to Claim 32, wherein said plant cell is obtained from a plant selected 
from the group consisting of Arabidopsis, soybean, and corn. 
NUCLEIC ACID SEQUENCES TO PROTEINS INVOLVED IN TOCOPHEROL SYNTHESIS

Abstract

Nucleic acid sequences and methods are provided for producing plants and seeds having
altered tocopherol content and compositions. The methods find particular use in increasing the
tocopherol levels in plants, and in providing desirable tocopherol compositions in a host plant
cell.
SEQUENCE LISTING

<110> Lassner, M
      Post-Beittenmiller, D
      Savidge, B
      Weiss, J

<120> Nucleic Acid Sequences Involved in 
Tocopherol Synthesis 

<130> 17133/02/WO 

<150> 60/129,899 
<151> 1999-04-15 

<150> 60/146,461 
<151> 1999-07-30 

<160> 94 

<170> FastSEQ for Windows Version 4.0 

<210> 1 
<211> 1182 
<212> DNA 

<213> Arabidopsis sp 
<400> 1 

atggagtctc tgctctctag ttcttctctt gtttccgctg ctggtgggtt ttgttggaag 60
aagcagaatc taaagctcca ctctttatca gaaatccgag ttctgcgttg tgattcgagt 120
aaagttgtcg caaaaccgaa gtttaggaac aatcttgtta ggcctgatgg tcaaggatct 180
tcattgttgt tgtatccaaa acataagtcg agatttcggg ttaatgccac tgcgggtcag 240
cctgaggctt tcgactcgaa tagcaaacag aagtctttta gagactcgtt agatgcgttt 300
tacaggtttt ctaggcctca tacagttatt ggcacagtgc ttagcatttt atctgtatct 360
ttcttagcag tagagaaggt ttctgatata tctcctttac ttttcactgg catcttggag 420
gctgttgttg cagctctcat gatgaacatt tacatagttg ggctaaatca gttgtctgat 480
gttgaaatag ataaggttaa caagccctat cttccattgg catcaggaga atattctgtt 540
aacaccggca ttgcaatagt agcttccttc tccatcatga gtttctggct tgggtggatt 600
gttggttcat ggccattgtt ctgggctctt tttgtgagtt tcatgctcgg tactgcatac 660
tctatcaatt tgccactttt acggtggaaa agatttgcat tggttgcagc aatgtgtatc 720
ctcgctgtcc gagctattat tgttcaaatc gccttttatc tacatattca gacacatgtg 780
tttggaagac caatcttgtt cactaggcct cttattttcg ccactgcgtt tatgagcttt 840
ttctctgtcg ttattgcatt gtttaaggat atacctgata tcgaagggga taagatattc 900
ggaatccgat cattctctgt aactctgggt cagaaacggg tgttttggac atgtgttaca 960
ctacttcaaa tggcttacgc tgttgcaatt ctagttggag ccacatctcc attcatatgg 1020
agcaaagtca tctcggttgt gggtcatgtt atactcgcaa caactttgtg ggctcgagct 1080
aagtccgttg atctgagtag caaaaccgaa ataacttcat gttatatgtt catatggaag 1140
ctcttttatg cagagtactt gctgttacct tttttgaagt ga 1182



<210> 2
<211> 393
<212> PRT
<213> Arabidopsis sp

<400> 2
Met Glu Ser Leu Leu Ser Ser Ser Ser Leu Val Ser Ala Ala Gly Gly
1               5                   10                  15
Phe Cys Trp Lys Lys Gln Asn Leu Lys Leu His Ser Leu Ser Glu Ile
            20                  25                  30
Arg Val Leu Arg Cys Asp Ser Ser Lys Val Val Ala Lys Pro Lys Phe
        35                  40                  45
Arg Asn Asn Leu Val Arg Pro Asp Gly Gln Gly Ser Ser Leu Leu Leu
    50                  55                  60
Tyr Pro Lys His Lys Ser Arg Phe Arg Val Asn Ala Thr Ala Gly Gln
65                  70                  75                  80
Pro Glu Ala Phe Asp Ser Asn Ser Lys Gln Lys Ser Phe Arg Asp Ser
                85                  90                  95
Leu Asp Ala Phe Tyr Arg Phe Ser Arg Pro His Thr Val Ile Gly Thr
            100                 105                 110
Val Leu Ser Ile Leu Ser Val Ser Phe Leu Ala Val Glu Lys Val Ser
        115                 120                 125
Asp Ile Ser Pro Leu Leu Phe Thr Gly Ile Leu Glu Ala Val Val Ala
    130                 135                 140
Ala Leu Met Met Asn Ile Tyr Ile Val Gly Leu Asn Gln Leu Ser Asp
145                 150                 155                 160
Val Glu Ile Asp Lys Val Asn Lys Pro Tyr Leu Pro Leu Ala Ser Gly
                165                 170                 175
Glu Tyr Ser Val Asn Thr Gly Ile Ala Ile Val Ala Ser Phe Ser Ile
            180                 185                 190
Met Ser Phe Trp Leu Gly Trp Ile Val Gly Ser Trp Pro Leu Phe Trp
        195                 200                 205
Ala Leu Phe Val Ser Phe Met Leu Gly Thr Ala Tyr Ser Ile Asn Leu
    210                 215                 220
Pro Leu Leu Arg Trp Lys Arg Phe Ala Leu Val Ala Ala Met Cys Ile
225                 230                 235                 240
Leu Ala Val Arg Ala Ile Ile Val Gln Ile Ala Phe Tyr Leu His Ile
                245                 250                 255
Gln Thr His Val Phe Gly Arg Pro Ile Leu Phe Thr Arg Pro Leu Ile
            260                 265                 270
Phe Ala Thr Ala Phe Met Ser Phe Phe Ser Val Val Ile Ala Leu Phe
        275                 280                 285
Lys Asp Ile Pro Asp Ile Glu Gly Asp Lys Ile Phe Gly Ile Arg Ser
    290                 295                 300
Phe Ser Val Thr Leu Gly Gln Lys Arg Val Phe Trp Thr Cys Val Thr
305                 310                 315                 320
Leu Leu Gln Met Ala Tyr Ala Val Ala Ile Leu Val Gly Ala Thr Ser
                325                 330                 335
Pro Phe Ile Trp Ser Lys Val Ile Ser Val Val Gly His Val Ile Leu
            340                 345                 350
Ala Thr Thr Leu Trp Ala Arg Ala Lys Ser Val Asp Leu Ser Ser Lys
        355                 360                 365
Thr Glu Ile Thr Ser Cys Tyr Met Phe Ile Trp Lys Leu Phe Tyr Ala
    370                 375                 380
Glu Tyr Leu Leu Leu Pro Phe Leu Lys
385                 390

<210> 3
<211> 1224
<212> DNA
<213> Arabidopsis sp

<400> 3
atggcgtttt ttgggctctc ccgtgtttca agacggttgt tgaaatcttc cgtctccgta 60
actccatctt cttcctctgc tcttttgcaa tcacaacata aatccttgtc caatcctgtg 120
actacccatt acacaaatcc tttcactaag tgttatcctt catggaatga taattaccaa 180
gtatggagta aaggaagaga attgcatcag gagaagtttt ttggtgttgg ttggaattac 240
agattaattt gtggaatgtc gtcgtcttct tcggttttgg agggaaagcc gaagaaagat 300
gataaggaga agagtgatgg tgttgttgtt aagaaagctt cttggataga tttgtattta 360
ccagaagaag ttagaggtta tgctaagctt gctcgattgg ataaacccat tggaacttgg 420
ttgcttgcgt ggccttgtat gtggtcgatt gcgttggctg ctgatcctgg aagccttcca 480
agttttaaat atatggcttt atttggttgc ggagcattac ttcttagagg tgctggttgt 540
actataaatg atctgcttga tcaggacata gatacaaagg ttgatcgtac aaaactaaga 600
cctatcgcca gtggtctttt gacaccattt caagggattg gatttctcgg gctgcagttg 660
cttttaggct tagggattct tctccaactt aacaattaca gccgtgtttt aggggcttca 720
tctttgttac ttgtcttttc ctacccactt atgaagaggt ttacattttg gcctcaagcc 780
tttttaggtt tgaccataaa ctggggagca ttgttaggat ggactgcagt taaaggaagc 840
atagcaccat ctattgtact ccctctctat ctctccggag tctgctggac ccttgtttat 900
gatactattt atgcacatca ggacaaagaa gatgatgtaa aagttggtgt taagtcaaca 960
gcccttagat tcggtgataa tacaaagctt tggttaactg gatttggcac agcatccata 1020
ggttttcttg cactttctgg attcagtgca gatctcgggt ggcaatatta cgcatcactg 1080
gccgctgcat caggacagtt aggatggcaa atagggacag ctgacttatc atctggtgct 1140
gactgcagta gaaaatttgt gtcgaacaag tggtttggtg ctattatatt tagtggagtt 1200
gtacttggaa gaagttttca ataa 1224

<210> 4
<211> 407
<212> PRT
<213> Arabidopsis sp

<400> 4
Met Ala Phe Phe Gly Leu Ser Arg Val Ser Arg Arg Leu Leu Lys Ser
1               5                   10                  15
Ser Val Ser Val Thr Pro Ser Ser Ser Ser Ala Leu Leu Gln Ser Gln
            20                  25                  30
His Lys Ser Leu Ser Asn Pro Val Thr Thr His Tyr Thr Asn Pro Phe
        35                  40                  45
Thr Lys Cys Tyr Pro Ser Trp Asn Asp Asn Tyr Gln Val Trp Ser Lys
    50                  55                  60
Gly Arg Glu Leu His Gln Glu Lys Phe Phe Gly Val Gly Trp Asn Tyr
65                  70                  75                  80
Arg Leu Ile Cys Gly Met Ser Ser Ser Ser Ser Val Leu Glu Gly Lys
                85                  90                  95
Pro Lys Lys Asp Asp Lys Glu Lys Ser Asp Gly Val Val Val Lys Lys
            100                 105                 110
Ala Ser Trp Ile Asp Leu Tyr Leu Pro Glu Glu Val Arg Gly Tyr Ala
        115                 120                 125
Lys Leu Ala Arg Leu Asp Lys Pro Ile Gly Thr Trp Leu Leu Ala Trp
    130                 135                 140
Pro Cys Met Trp Ser Ile Ala Leu Ala Ala Asp Pro Gly Ser Leu Pro
145                 150                 155                 160
Ser Phe Lys Tyr Met Ala Leu Phe Gly Cys Gly Ala Leu Leu Leu Arg
                165                 170                 175
Gly Ala Gly Cys Thr Ile Asn Asp Leu Leu Asp Gln Asp Ile Asp Thr
            180                 185                 190
Lys Val Asp Arg Thr Lys Leu Arg Pro Ile Ala Ser Gly Leu Leu Thr
        195                 200                 205
Pro Phe Gln Gly Ile Gly Phe Leu Gly Leu Gln Leu Leu Leu Gly Leu
    210                 215                 220
Gly Ile Leu Leu Gln Leu Asn Asn Tyr Ser Arg Val Leu Gly Ala Ser
225                 230                 235                 240
Ser Leu Leu Leu Val Phe Ser Tyr Pro Leu Met Lys Arg Phe Thr Phe
                245                 250                 255
Trp Pro Gln Ala Phe Leu Gly Leu Thr Ile Asn Trp Gly Ala Leu Leu
            260                 265                 270
Gly Trp Thr Ala Val Lys Gly Ser Ile Ala Pro Ser Ile Val Leu Pro
        275                 280                 285
Leu Tyr Leu Ser Gly Val Cys Trp Thr Leu Val Tyr Asp Thr Ile Tyr
    290                 295                 300
Ala His Gln Asp Lys Glu Asp Asp Val Lys Val Gly Val Lys Ser Thr
305                 310                 315                 320
Ala Leu Arg Phe Gly Asp Asn Thr Lys Leu Trp Leu Thr Gly Phe Gly
                325                 330                 335
Thr Ala Ser Ile Gly Phe Leu Ala Leu Ser Gly Phe Ser Ala Asp Leu
            340                 345                 350
Gly Trp Gln Tyr Tyr Ala Ser Leu Ala Ala Ala Ser Gly Gln Leu Gly
        355                 360                 365
Trp Gln Ile Gly Thr Ala Asp Leu Ser Ser Gly Ala Asp Cys Ser Arg
    370                 375                 380
Lys Phe Val Ser Asn Lys Trp Phe Gly Ala Ile Ile Phe Ser Gly Val
385                 390                 395                 400
Val Leu Gly Arg Ser Phe Gln
                405

<210> 5
<211> 1296
<212> DNA
<213> Arabidopsis sp

<400> 5
atgtggcgaa gatctgttgt ttctcgttta tcttcaagaa tctctgtttc ttcttcgtta 60
ccaaacccta gactgattcc ttggtcccgc gaattatgtg ccgttaatag cttctcccag 120
cctccggtct cgacggaatc aactgctaag ttagggatca ctggtgttag atctgatgcc 180
aatcgagttt ttgccactgc tactgccgcc gctacagcta cagctaccac cggtgagatt 240
tcgtctagag ttgcggcttt ggctggatta gggcatcact acgctcgttg ttattgggag 300
ctttctaaag ctaaacttag tatgcttgtg gttgcaactt ctggaactgg gtatattctg 360
ggtacgggaa atgctgcaat tagcttcccg gggctttgtt acacatgtgc aggaaccatg 420
atgattgctg catctgctaa ttccttgaat cagatttttg agataagcaa tgattctaag 480
atgaaaagaa cgatgctaag gccattgcct tcaggacgta ttagtgttcc acacgctgtt 540
gcatgggcta ctattgctgg tgcttctggt gcttgtttgt tggccagcaa gactaatatg 600
ttggctgctg gacttgcatc tgccaatctt gtactttatg cgtttgttta tactccgttg 660
aagcaacttc accctatcaa tacatgggtt ggcgctgttg ttggtgctat cccacccttg 720
cttgggtggg cggcagcgtc tggtcagatt tcatacaatt cgatgattct tccagctgct 780
ctttactttt ggcagatacc tcattttatg gcccttgcac atctctgccg caatgattat 840
gcagctggag gttacaagat gttgtcactc tttgatccgt cagggaagag aatagcagca 900
gtggctctaa ggaactgctt ttacatgatc cctctcggtt tcatcgccta tgactggggg 960
ttaacctcaa gttggttttg cctcgaatca acacttctca cactagcaat cgctgcaaca 1020
gcattttcat tctaccgaga ccggaccatg cataaagcaa ggaaaatgtt ccatgccagt 1080
cttctcttcc ttcctgtttt catgtctggt cttcttctac accgtgtctc taatgataat 1140
cagcaacaac tcgtagaaga agccggatta acaaattctg tatctggtga agtcaaaact 1200
cagaggcgaa agaaacgtgt ggctcaacct ccggtggctt atgcctctgc tgcaccgttt 1260
cctttcctcc cagctccttc cttctactct ccatga 1296

<210> 6
<211> 431
<212> PRT
<213> Arabidopsis sp

<400> 6
Met Trp Arg Arg Ser Val Val Tyr Arg Phe Ser Ser Arg Ile Ser Val
1               5                   10                  15
Ser Ser Ser Leu Pro Asn Pro Arg Leu Ile Pro Trp Ser Arg Glu Leu
            20                  25                  30
Cys Ala Val Asn Ser Phe Ser Gln Pro Pro Val Ser Thr Glu Ser Thr
        35                  40                  45
Ala Lys Leu Gly Ile Thr Gly Val Arg Ser Asp Ala Asn Arg Val Phe
    50                  55                  60
Ala Thr Ala Thr Ala Ala Ala Thr Ala Thr Ala Thr Thr Gly Glu Ile
65                  70                  75                  80
Ser Ser Arg Val Ala Ala Leu Ala Gly Leu Gly His His Tyr Ala Arg
                85                  90                  95
Cys Tyr Trp Glu Leu Ser Lys Ala Lys Leu Ser Met Leu Val Val Ala
            100                 105                 110
Thr Ser Gly Thr Gly Tyr Ile Leu Gly Thr Gly Asn Ala Ala Ile Ser
        115                 120                 125
Phe Pro Gly Leu Cys Tyr Thr Cys Ala Gly Thr Met Met Ile Ala Ala
    130                 135                 140
Ser Ala Asn Ser Leu Asn Gln Ile Phe Glu Ile Ser Asn Asp Ser Lys
145                 150                 155                 160
Met Lys Arg Thr Met Leu Arg Pro Leu Pro Ser Gly Arg Ile Ser Val
                165                 170                 175
Pro His Ala Val Ala Trp Ala Thr Ile Ala Gly Ala Ser Gly Ala Cys
            180                 185                 190
Leu Leu Ala Ser Lys Thr Asn Met Leu Ala Ala Gly Leu Ala Ser Ala
        195                 200                 205
Asn Leu Val Leu Tyr Ala Phe Val Tyr Thr Pro Leu Lys Gln Leu His
    210                 215                 220
Pro Ile Asn Thr Trp Val Gly Ala Val Val Gly Ala Ile Pro Pro Leu
225                 230                 235                 240
Leu Gly Trp Ala Ala Ala Ser Gly Gln Ile Ser Tyr Asn Ser Met Ile
                245                 250                 255
Leu Pro Ala Ala Leu Tyr Phe Trp Gln Ile Pro His Phe Met Ala Leu
            260                 265                 270
Ala His Leu Cys Arg Asn Asp Tyr Ala Ala Gly Gly Tyr Lys Met Leu
        275                 280                 285
Ser Leu Phe Asp Pro Ser Gly Lys Arg Ile Ala Ala Val Ala Leu Arg
    290                 295                 300
Asn Cys Phe Tyr Met Ile Pro Leu Gly Phe Ile Ala Tyr Asp Trp Gly
305                 310                 315                 320
Leu Thr Ser Ser Trp Phe Cys Leu Glu Ser Thr Leu Leu Thr Leu Ala
                325                 330                 335
Ile Ala Ala Thr Ala Phe Ser Phe Tyr Arg Asp Arg Thr Met His Lys
            340                 345                 350
Ala Arg Lys Met Phe His Ala Ser Leu Leu Phe Leu Pro Val Phe Met
        355                 360                 365
Ser Gly Leu Leu Leu His Arg Val Ser Asn Asp Asn Gln Gln Gln Leu
    370                 375                 380
Val Glu Glu Ala Gly Leu Thr Asn Ser Val Ser Gly Glu Val Lys Thr
385                 390                 395                 400
Gln Arg Arg Lys Lys Arg Val Ala Gln Pro Pro Val Ala Tyr Ala Ser
                405                 410                 415
Ala Ala Pro Phe Pro Phe Leu Pro Ala Pro Ser Phe Tyr Ser Pro
            420                 425                 430

<210> 7
<211> 479
<212> DNA
<213> Arabidopsis sp

<400> 7
ggaaactccc ggagcacctg tttgcaggta ccgctaacct taatcgataa tttatttctc 60
ttgtcaggaa ttatgtaagt ctggtggaag gctcgcatac catttttgca ttgcctttcg 120
ctatgatcgg gtttactttg ggtgtgatga gaccaggcgt ggctttatgg tatggcgaaa 180
acccattttt atccaatgct gcattccctc ccgatgattc gttctttcat tcctatacag 240
gtatcatgct gataaaactg ttactggtac tggtttgtat ggtatcagca agaagcgcgg 300
cgatggcgtt taaccggtat ctcgacaggc attttgacgc gaagaacccg cgtactgcca 360
tccgtgaaat acctgcgggc gtcatatctg ccaacagtgc gctggtgttt acgataggct 420
gctgcgtggt attctgggtg gcctgttatt tcattaacac gatctgtttt tacctggcg 479

<210> 8
<211> 551
<212> DNA
<213> Arabidopsis sp

<220>
<221> misc_feature
<222> (1)...(551)
<223> n = A,T,C or G

<400> 8
ttgtggctta caccttaatg agcatacgcc agnccattac ggctcgttaa tcggcgccat 60
ngccggngct gntgcaccgg tagtgggcta ctgcgccgtg accaatcagc ttgatctagc 120
ggctcttatt ctgtttttaa ttttactgtt ctggcaaatg ccgcattttt acgcgatttc 180
cattttcagg ctaaaagact tttcagcggc ctgtattccg gtgctgccca tcattaaaga 240
cctgcgctat accaaaatca gcatgctggt ttacgtgggc ttatttacac tggctgctat 300
catgccggcc ctcttagggt atgccggttg gatttatggg atagcggcct taattttagg 360
cttgtattgg ctttatattg ccatacaagg attcaagacc gccgatgatc aaaaatggtc 420
tcgtaagatg tttggatctt cgattttaat cattaccctc ttgtcggtaa tgatgcttgt 480
ttaaacttac tgcctcctga agtttatata tcgataattt cagcttaagg aggcttagtg 540
gttaattcaa t 551

<210> 9
<211> 297
<212> PRT
<213> Arabidopsis sp

<400> 9
Met Val Leu Ala Glu Val Pro Lys Leu Ala Ser Ala Ala Glu Tyr Phe
1               5                   10                  15
Phe Lys Arg Gly Val Gln Gly Lys Gln Phe Arg Ser Thr Ile Leu Leu
            20                  25                  30
Leu Met Ala Thr Ala Leu Asn Val Arg Val Pro Glu Ala Leu Ile Gly
        35                  40                  45
Glu Ser Thr Asp Ile Val Thr Ser Glu Leu Arg Val Arg Gln Arg Gly
    50                  55                  60
Ile Ala Glu Ile Thr Glu Met Ile His Val Ala Ser Leu Leu His Asp
65                  70                  75                  80
Asp Val Leu Asp Asp Ala Asp Thr Arg Arg Gly Val Gly Ser Leu Asn
                85                  90                  95
Val Val Met Gly Asn Lys Val Val Ala Leu Leu Ala Thr Ala Val Glu
            100                 105                 110
His Leu Val Thr Gly Glu Thr Met Glu Ile Thr Ser Ser Thr Glu Gln
        115                 120                 125
Arg Tyr Ser Met Asp Tyr Tyr Met Gln Lys Thr Tyr Tyr Lys Thr Ala
    130                 135                 140
Ser Leu Ile Ser Asn Ser Cys Lys Ala Val Ala Val Leu Thr Gly Gln
145                 150                 155                 160
Thr Ala Glu Val Ala Val Leu Ala Phe Glu Tyr Gly Arg Asn Leu Gly
                165                 170                 175
Leu Ala Phe Gln Leu Ile Asp Asp Ile Leu Asp Phe Thr Gly Thr Ser
            180                 185                 190
Ala Ser Leu Gly Lys Gly Ser Leu Ser Asp Ile Arg His Gly Val Ile
        195                 200                 205
Thr Ala Pro Ile Leu Phe Ala Met Glu Glu Phe Pro Gln Leu Arg Glu
    210                 215                 220
Val Val Asp Gln Val Glu Lys Asp Pro Arg Asn Val Asp Ile Ala Leu
225                 230                 235                 240
Glu Tyr Leu Gly Lys Ser Lys Gly Ile Gln Arg Ala Arg Glu Leu Ala
                245                 250                 255
Met Glu His Ala Asn Leu Ala Ala Ala Ala Ile Gly Ser Leu Pro Glu
            260                 265                 270
Thr Asp Asn Glu Asp Val Lys Arg Ser Arg Arg Ala Leu Ile Asp Leu
        275                 280                 285
Thr His Arg Val Ile Thr Arg Asn Lys
    290                 295

<210> 10
<211> 561
<212> DNA
<213> Arabidopsis sp

<400> 10
aagcgcatcc gtcctcttct acgattgccg ccagccgcat gtatggctgc ataaccgacc 60
gcccctatcc gctcgcggcc gcggtcgaat tcattcacac cgcgacgctg ctgcatgacg 120
acgtcgtcga tgaaagcgat ttgcgccgcg gccgcgaaag cgcgcataag gttttcggca 180
atcaggcgag cgtgctcgtc ggcgatttcc ttttctcccg cgccttccag ctgatggtgg 240
aagacggctc gctcgacgcg ctgcgcattc tctcggatgc ctccgccgtg atcgcgcagg 300
gcgaagtgat gcagctcggc accgcgcgca atcttgaaac caatatgagc cagtatctcg 360
atgtgatcag cgcgaagacc gccgcgctct ttgccgccgc ctgcgaaatc ggcccggtga 420
tggcgaacgc gaaggcggaa gatgctgccg cgatgtgcga atacggcatg aatctcggta 480
tcgccttcca gatcatcgac gaccttctcg attacggcac cggcggccac gccgagcttg 540
gcaagaacac gggcgacgat t 561

<210> 11
<211> 966
<212> DNA
<213> Arabidopsis sp

<400> 11
atggtacttg ccgaggttcc aaagcttgcc tctgctgctg agtacttctt caaaaggggt 60
gtgcaaggaa aacagtttcg ttcaactatt ttgctgctga tggcgacagc tctgaatgta 120
cgcgttccag aagcattgat tggggaatca acagatatag tcacatcaga attacgcgta 180
aggcaacggg gtattgctga aatcactgaa atgatacacg tcgcaagtct actgcacgat 240
gatgtcttgg atgatgccga tacaaggcgt ggtgttggtt ccttaaatgt tgtaatgggt 300
aacaagatgt cggtattagc aggagacttc ttgctctccc gggcttgtgg ggctctcgct 360
gctttaaaga acacagaggt tgtagcatta cttgcaactg ctgtagaaca tcttgttacc 420
ggtgaaacca tggaaataac tagttcaacc gagcagcgtt atagtatgga ctactacatg 480
cagaagacat attataagac agcatcgcta atctctaaca gctgcaaagc tgttgccgtt 540
ctcactggac aaacagcaga agttgccgtg ttagcttttg agtatgggag gaatctgggt 600
ttagcattcc aattaataga cgacattctt gatttcacgg gcacatctgc ctctctcgga 660
aagggatcgt tgtcagatat tcgccatgga gtcataacag ccccaatcct ctttgccatg 720
gaagagtttc ctcaactacg cgaagttgtt gatcaagttg aaaaagatcc taggaatgtt 780
gacattgctt tagagtatct tgggaagagc aagggaatac agagggcaag agaattagcc 840
atggaacatg cgaatctagc agcagctgca atcgggtctc tacctgaaac agacaatgaa 900
gatgtcaaaa gatcgaggcg ggcacttatt gacttgaccc atagagtcat caccagaaac 960
aagtga 966

<210> 12
<211> 321
<212> PRT
<213> Arabidopsis sp

<400> 12
Met Val Leu Ala Glu Val Pro Lys Leu Ala Ser Ala Ala Glu Tyr Phe
1               5                   10                  15
Phe Lys Arg Gly Val Gln Gly Lys Gln Phe Arg Ser Thr Ile Leu Leu
            20                  25                  30
Leu Met Ala Thr Ala Leu Asn Val Arg Val Pro Glu Ala Leu Ile Gly
        35                  40                  45
Glu Ser Thr Asp Ile Val Thr Ser Glu Leu Arg Val Arg Gln Arg Gly
    50                  55                  60
Ile Ala Glu Ile Thr Glu Met Ile His Val Ala Ser Leu Leu His Asp
65                  70                  75                  80
Asp Val Leu Asp Asp Ala Asp Thr Arg Arg Gly Val Gly Ser Leu Asn
                85                  90                  95
Val Val Met Gly Asn Lys Met Ser Val Leu Ala Gly Asp Phe Leu Leu
            100                 105                 110
Ser Arg Ala Cys Gly Ala Leu Ala Ala Leu Lys Asn Thr Glu Val Val
        115                 120                 125
Ala Leu Leu Ala Thr Ala Val Glu His Leu Val Thr Gly Glu Thr Met
    130                 135                 140
Glu Ile Thr Ser Ser Thr Glu Gln Arg Tyr Ser Met Asp Tyr Tyr Met
145                 150                 155                 160
Gln Lys Thr Tyr Tyr Lys Thr Ala Ser Leu Ile Ser Asn Ser Cys Lys
                165                 170                 175
Ala Val Ala Val Leu Thr Gly Gln Thr Ala Glu Val Ala Val Leu Ala
            180                 185                 190
Phe Glu Tyr Gly Arg Asn Leu Gly Leu Ala Phe Gln Leu Ile Asp Asp
        195                 200                 205
Ile Leu Asp Phe Thr Gly Thr Ser Ala Ser Leu Gly Lys Gly Ser Leu
    210                 215                 220
Ser Asp Ile Arg His Gly Val Ile Thr Ala Pro Ile Leu Phe Ala Met
225                 230                 235                 240
Glu Glu Phe Pro Gln Leu Arg Glu Val Val Asp Gln Val Glu Lys Asp
                245                 250                 255
Pro Arg Asn Val Asp Ile Ala Leu Glu Tyr Leu Gly Lys Ser Lys Gly
            260                 265                 270
Ile Gln Arg Ala Arg Glu Leu Ala Met Glu His Ala Asn Leu Ala Ala
        275                 280                 285
Ala Ala Ile Gly Ser Leu Pro Glu Thr Asp Asn Glu Asp Val Lys Arg
    290                 295                 300
Ser Arg Arg Ala Leu Ile Asp Leu Thr His Arg Val Ile Thr Arg Asn
305                 310                 315                 320
Lys



<210> 13
<211> 621
<212> DNA
<213> Arabidopsis sp

<400> 13
gctttctcct ttgctaattc ttgagctttc ttgatcccac cgcgatttct aactatttca 60
atcgcttctt caagcgatcc aggctcacaa aactcagact caatgatctc tcttagcctt 120
ggctcattct ctagcgcgaa gatcactggc gccgttatgt tacctttggc taagtcatta 180
gctgcaggct tacctaactg ctctgtggac tgagtgaagt ccagaatgtc atcaactact 240
tgaaaagata aaccgagatt cttcccgaac tgatacattt gctctgcgac cttgctttcg 300
actttactga aaattgctgc tcctttggtg cttgcagcta ctaatgaagc tgtcttgtag 360
taactcttta gcatgtagtc atcaagcttg acatcacaat cgaataaact cgatgcttgc 420
tttatctcac cgcttgcaaa atctttgatc acctgcaaaa agataaatca agattcagac 480
caaatgttct ttgtattgag tagcttcatc taatctcaga aaggaatatt acctgactta 540
tgagcttaat gacttcaagg ttttcgagat ttgtaagtac catgatgctt gagcaacatg 600
aaatccccag ctaatacagc t 621

<210> 14
<211> 741
<212> DNA
<213> Arabidopsis sp

<400> 14
ggtgagtttt gttaatagtt atgagattca tctatttttg tcataaaatt gtttggtttg 60
gtttaaactc tgtgtataat tgcaggaaag gaaacagttc atgagctttt cggcacaaga 120
gtagcggtgc tagctggaga tttcatgttt gctcaagcgt catggtactt agcaaatctc 180
gagaatcttg aagttattaa gctcatcagt caggtactta gttactctta cattgttttt 240
ctatgaggtt gagctatgaa tctcatttcg ttgaataatg ctgtgcctca aacttttttt 300
catgttttca ggtgatcaaa gactttgcaa gcggagagat aaagcaggcg tccagcttat 360
ttgactgcga caccaagctc gacgagtact tactcaaaag tttctacaag acagcctctt 420
tagtggctgc gagcaccaaa ggagctgcca ttttcagcag agttgagcct gatgtgacag 480
aacaaatgta cgagtttggg aagaatctcg gtctctcttt ccagatagtt gatgatattt 540
tggatttcac tcagtcgaca gagcagctcg ggaagccagc agggagtgat ttggctaaag 600
gtaacttaac agcacctgtg attttcgctc tggagaggga gccaaggcta agagagatca 660
ttgagtcaaa gttctgtgag gcgggttctc tggaagaagc gattgaagcg gtgacaaaag 720
gtggggggat taagagagca c 741

<210> 15
<211> 1087
<212> DNA
<213> Arabidopsis sp

<400> 15
cctcttcagc caatccagag gaagaagaga caacttttta tctttcgtca agagtctccg 60
aaaacgcacg gttttatgct ctctcttctg ccctcacctc acaagacgca gggcacatga 120
ttcaaccaga gggaaaaagc aacgataaca actctgcttt tgatttcaag ctgtatatga 180
tccgcaaagc cgagtctgta aatgcggctc tcgacgtttc cgtaccgctt ctgaaacccc 240
ttacgatcca agaagcggtc aggtactctt tgctagccgg cggaaaacgt gtgaggcctc 300
tgctctgcat tgccgcttgt gagcttgtgg ggggcgacga ggctactgcc atgtcagccg 360
cttgcgcggt cgagatgatc cacacaagct ctctcattca tgacgatctt ccgtgcatgg 420
acaatgccga cctccgtaga ggcaagccca ccaatcacaa ggtatgttgt ttaattatat 480
gaaggctcag agataatgct gaactagtgt tgaaccaatt tttgctcaaa caaggtatat 540
ggagaagaca tggcggtttt ggcaggtgat gcactccttg cattggcgtt tgagcacatg 600
acggttgtgt cgagtgggtt ggtcgctccc gagaagatga ttcgcgccgt ggttgagctg 660
gccagggcca tagggactac agggctagtt gctggacaaa tgatagacct agccagcgaa 720
agactgaatc cagacaaggt tggattggag catctagagt tcatccatct ccacaaaacg 780
gcggcattgt tggaggcagc ggcagtttta ggggttataa tgggaggtgg aacagaggaa 840
gaaatcgaaa agcttagaaa gtatgctagg tgtattggac tactgtttca ggttgttgat 900
gacattctcg acgtaacaaa atctactgag [illegible] [illegible] [illegible] 960
atggccggaa agctgacgta tccaaggctg ataggtttgg agggatccag ggaagttgca 1020
gagcacctga ggagagaagc agaggaaaag cttaaagggt ttgatccaag tcaggcggcg 1080
cctctgg 1087




<210> 16
<211> 1164
<212> DNA
<213> Arabidopsis sp

<400> 16
atgacttcga ttctcaacac tgtctccacc atccactctt ccagagttac ctccgtcgat 60
cgagtcggag tcctctctct tcggaattcg gattccgttg agttcactcg ccggcgttct 120
ggtttctcga cgttgatcta cgaatcaccc gggcggagat ttgttgtgcg tgcggcggag 180
actgatactg ataaagttaa atctcagaca cctgacaagg caccagccgg tggttcaagc 240
attaaccagc ttctcggtat caaaggagca tctcaagaaa ctaataaatg gaagattcgt 300
cttcagctta caaaaccagt cacttggcct ccactggttt ggggagtcgt ctgtggtgct 360
gctgcttcag ggaactttca ttggacccca gaggatgttg ctaagtcgat tctttgcatg 420
atgatgtctg gtccttgtct tactggctat acacagacaa tcaacgactg gtatgataga 480
gatatcgacg caattaatga gccatatcgt ccaattccat ctggagcaat atcagagcca 540
gaggttatta cacaagtctg ggtgctatta ttgggaggtc ttggtattgc tggaatatta 600
gatgtgtggg cagggcatac cactcccact gtcttctatc ttgctttggg aggatcattg 660
ctatcttata tatactctgc tccacctctt aagctaaaac aaaatggatg ggttggaaat 720
tttgcacttg gagcaagcta tattagtttg ccatggtggg ctggccaagc attgtttggc 780
actcttacgc cagatgttgt tgttctaaca ctcttgtaca gcatagctgg gttaggaata 840
gccattgtta acgacttcaa aagtgttgaa ggagatagag cattaggact tcagtctctc 900
ccagtagctt ttggcaccga aactgcaaaa tggatatgcg ttggtgctat agacattact 960
cagctttctg ttgccggata tctattagca tctgggaaac cttattatgc gttggcgttg 1020
gttgctttga tcattcctca gattgtgttc cagtttaaat actttctcaa ggaccctgtc 1080
aaatacgacg tcaagtacca ggcaagcgcg cagccattct tggtgctcgg aatatttgta 1140
acggcattag catcgcaaca ctga 1164




<210> 17
<211> 387
<212> PRT
<213> Arabidopsis sp

<400> 17

Met Thr Ser Ile Leu Asn Thr Val Ser Thr Ile His Ser Ser Arg Val
1 5 10 15

Thr Ser Val Asp Arg Val Gly Val Leu Ser Leu Arg Asn Ser Asp Ser
20 25 30

Val Glu Phe Thr Arg Arg Arg Ser Gly Phe Ser Thr Leu Ile Tyr Glu
35 40 45

Ser Pro Gly Arg Arg Phe Val Val Arg Ala Ala Glu Thr Asp Thr Asp
50 55 60

Lys Val Lys Ser Gln Thr Pro Asp Lys Ala Pro Ala Gly Gly Ser Ser
65 70 75 80

Ile Asn Gln Leu Leu Gly Ile Lys Gly Ala Ser Gln Glu Thr Asn Lys
85 90 95

Trp Lys Ile Arg Leu Gln Leu Thr Lys Pro Val Thr Trp Pro Pro Leu
100 105 110

Val Trp Gly Val Val Cys Gly Ala Ala Ala Ser Gly Asn Phe His Trp
115 120 125

Thr Pro Glu Asp Val Ala Lys Ser Ile Leu Cys Met Met Met Ser Gly
130 135 140

Pro Cys Leu Thr Gly Tyr Thr Gln Thr Ile Asn Asp Trp Tyr Asp Arg
145 150 155 160

Asp Ile Asp Ala Ile Asn Glu Pro Tyr Arg Pro Ile Pro Ser Gly Ala
165 170 175

Ile Ser Glu Pro Glu Val Ile Thr Gln Val Trp Val Leu Leu Leu Gly
180 185 190

Gly Leu Gly Ile Ala Gly Ile Leu Asp Val Trp Ala Gly His Thr Thr
195 200 205

Pro Thr Val Phe Tyr Leu Ala Leu Gly Gly Ser Leu Leu Ser Tyr Ile
210 215 220

Tyr Ser Ala Pro Pro Leu Lys Leu Lys Gln Asn Gly Trp Val Gly Asn
225 230 235 240

Phe Ala Leu Gly Ala Ser Tyr Ile Ser Leu Pro Trp Trp Ala Gly Gln
245 250 255

Ala Leu Phe Gly Thr Leu Thr Pro Asp Val Val Val Leu Thr Leu Leu
260 265 270

Tyr Ser Ile Ala Gly Leu Gly Ile Ala Ile Val Asn Asp Phe Lys Ser
275 280 285

Val Glu Gly Asp Arg Ala Leu Gly Leu Gln Ser Leu Pro Val Ala Phe
290 295 300

Gly Thr Glu Thr Ala Lys Trp Ile Cys Val Gly Ala Ile Asp Ile Thr
305 310 315 320

Gln Leu Ser Val Ala Gly Tyr Leu Leu Ala Ser Gly Lys Pro Tyr Tyr
325 330 335

Ala Leu Ala Leu Val Ala Leu Ile Ile Pro Gln Ile Val Phe Gln Phe
340 345 350

Lys Tyr Phe Leu Lys Asp Pro Val Lys Tyr Asp Val Lys Tyr Gln Ala
355 360 365

Ser Ala Gln Pro Phe Leu Val Leu Gly Ile Phe Val Thr Ala Leu Ala
370 375 380

Ser Gln His
385



<210> 18
<211> 981
<212> DNA
<213> Arabidopsis sp

<400> 18
atgttgttta gtggttcagc gatcccatta agcagcttct gctctcttcc ggagaaaccc 60
cacactcttc ctatgaaact ctctcccgct gcaatccgat cttcatcctc atctgccccg 120
gggtcgttga acttcgatct gaggacgtat tggacgactc tgatcaccga gatcaaccag 180
aagctggatg aggccatacc ggtcaagcac cctgcgggga tctacgaggc tatgagatac 240
tctgtactcg cacaaggcgc caagcgtgcc cctcctgtga tgtgtgtggc ggcctgcgag 300
ctcttcggtg gcgatcgcct cgccgctttc cccaccgcct gtgccctaga aatggtgcac 360
gcggcttcgt tgatacacga cgacctcccc tgtatggacg acgatcctgt gcgcagagga 420
aagccatcta accacactgt ctacggctct ggcatggcca ttctcgccgg tgacgccctc 480
ttcccactcg ccttccagca cattgtctcc cacacgcctc ctgaccttgt tccccgagcc 540
accatcctca gactcatcac tgagattgcc cgcactgtcg gctccactgg tatggctgca 600
ggccagtacg tcgaccttga aggaggtccc tttcctcttt cctttgttca ggagaagaaa 660
ttcggagcca tgggtgaatg ctctgccgtg tgcggtggcc tattgggcgg tgccactgag 720
gatgagctcc agagtctccg aaggtacggg agagccgtcg ggatgctgta tcaggtggtc 780
gatgacatca ccgaggacaa gaagaagagc tatgatggtg gagcagagaa gggaatgatg 840
gaaatggcgg aagagctcaa ggagaaggcg aagaaggagc ttcaagtgtt tgacaacaag 900
tatggaggag gagacacact tgttcctctc tacaccttcg ttgactacgc tgctcatcga 960
cattttcttc ttcccctctg a 981

<210> 19
<211> 245
<212> DNA
<213> Glycine sp

<400> 19
gcaacatctg ggactgggtt tgtcttgggg agtggtagtg ctgttgatct ttcggcactt 60
tcttgcactt gcttgggtac catgatggtt gctgcatctg ctaactcttt gaatcaggtg 120
tttgagatca ataatgatgc taaaatgaag agaacaagtc gcaggccact accctcagga 180
cgcatcacaa tacctcatgc agttggctgg gcatcctctg ttggattagc tggtacggct 240
ctact 245



Attorney Docket No: 17133/02/US

<210> 20
<211> 253
<212> DNA
<213> Glycine sp

<400> 20
attggctttc caagatcatt gggttttctt gttgcattca tgaccttcta ctccttgggt 60
ttggcattgt ccaaggatat acctgacgtt gaaggagata aagagcacgg cattgattct 120
tttgcagtac gtctaggtca gaaacgggca ttttggattt gcgtttcctt ttttgaaatg 180
gctttcggag ttggtatcct ggccggagca tcatgctcac acttttggac taaaattttc 240
acgggtatgg gaa 253

<210> 21
<211> 275
<212> DNA
<213> Glycine sp

<400> 21
tgatcttcta ctctctgggt atggcattgt ccaaggatat atctgacgtt aaaggagata 60
aagcatacgg catcgatact ttagcgatac gtttgggtca aaaatgggta ttttggattt 120
gcattatcct ttttgaaatg gcttttggag ttgccctctt ggcaggagca acatcttctt 180
acctttggat taaaattgtc acgggtctgg gacatgctat tcttgcttca attctcttgt 240
accaagccaa atctatatac ttgagcaaca aagtt 275

<210> 22
<211> 299
<212> DNA
<213> Glycine sp

<220>
<221> misc_feature
<222> (1)...(299)
<223> n = A,T,C or G

<400> 22
ccanaatang tncatcttng aaagacaatt ggcctcttca acacacaagt ctgcatgtga 60
agaagaggcc aattgtcttt ccaagatcac ttatngtggc tattgtaatc atgaacttct 120
tctttgtggg tatggcattg gcaaaggata tacctanctg ttgaaggaga taaaatatat 180
ggcattgata cttttgcaat acgtataggt caaaaacaag tattttggat ttgtattttc 240
ctttttgaaa ggctttcgga gtttccctag tggcaggagc aacatcttct agccttggt 299



<210> 23
<211> 767
<212> DNA
<213> Glycine sp

<400> 23
gtggaggctg tggttgctgc cctgtttatg aatatttata ttgttggttt gaatcaattg 60
tctgatgttg aaatagacaa gataaacaag ccgtatcttc cattagcatc tggggaatat 120
tcctttgaaa ctggtgtcac tattgttgca tctttttcaa ttctgagttt ttggcttggc 180
tgggttgtag gttcatggcc attattttgg gccctttttg taagctttgt gctaggaact 240
gcttattcaa tcaatgtgcc tctgttgaga tggaagaggt ttgcagtgct tgcagcgatg 300
tgcattctag ctgttcgggc agtaatagtt caacttgcat ttttccttca catgcagact 360
catgtgtaca agaggccacc tgtcttttca agaccattga tttttgctac tgcattcatg 420
agcttcttct ctgtagttat agcactgttt aaggatatac ctgacattga aggagataaa 480
gtatttggca tccaatcttt ttcagtgtgt ttaggtcaga agccggtgtt ctggacttgt 540
gttacccttc ttgaaatagc ttatggagtc gccctcctgg tgggagctgc atctccttgt 600
ctttggagca aaattttcac gggtctggga cacgctgtgc tggcttcaat tctctggttt 660
catgccaaat ctgtagattt gaaaagcaaa gcttcgataa catccttcta tatgtttatt 720
tggaagctat tttatgcaga atacttactc attccttttg ttagatg 767

<210> 24
<211> 255
<212> PRT
<213> Glycine sp

<400> 24

Val Glu Ala Val Val Ala Ala Leu Phe Met Asn Ile Tyr Ile Val Gly
1 5 10 15

Leu Asn Gln Leu Ser Asp Val Glu Ile Asp Lys Ile Asn Lys Pro Tyr
20 25 30

Leu Pro Leu Ala Ser Gly Glu Tyr Ser Phe Glu Thr Gly Val Thr Ile
35 40 45

Val Ala Ser Phe Ser Ile Leu Ser Phe Trp Leu Gly Trp Val Val Gly
50 55 60

Ser Trp Pro Leu Phe Trp Ala Leu Phe Val Ser Phe Val Leu Gly Thr
65 70 75 80

Ala Tyr Ser Ile Asn Val Pro Leu Leu Arg Trp Lys Arg Phe Ala Val
85 90 95

Leu Ala Ala Met Cys Ile Leu Ala Val Arg Ala Val Ile Val Gln Leu
100 105 110

Ala Phe Phe Leu His Met Gln Thr His Val Tyr Lys Arg Pro Pro Val
115 120 125

Phe Ser Arg Pro Leu Ile Phe Ala Thr Ala Phe Met Ser Phe Phe Ser
130 135 140

Val Val Ile Ala Leu Phe Lys Asp Ile Pro Asp Ile Glu Gly Asp Lys
145 150 155 160

Val Phe Gly Ile Gln Ser Phe Ser Val Cys Leu Gly Gln Lys Pro Val
165 170 175

Phe Trp Thr Cys Val Thr Leu Leu Glu Ile Ala Tyr Gly Val Ala Leu
180 185 190

Leu Val Gly Ala Ala Ser Pro Cys Leu Trp Ser Lys Ile Phe Thr Gly
195 200 205

Leu Gly His Ala Val Leu Ala Ser Ile Leu Trp Phe His Ala Lys Ser
210 215 220

Val Asp Leu Lys Ser Lys Ala Ser Ile Thr Ser Phe Tyr Met Phe Ile
225 230 235 240

Trp Lys Leu Phe Tyr Ala Glu Tyr Leu Leu Ile Pro Phe Val Arg
245 250 255














<210> 25
<211> 360
<212> DNA
<213> Zea sp

<220>
<221> misc_feature
<222> (1)...(360)
<223> n = A,T,C or G

<400> 25
ggcgtcttca cttgttctgg tcttctcgta tcccctgatg aagaggttca cattttggcc 60
tcaggcttat cttggcctga cattcaactg gggctgcttta ctagggtggg ctgctattaa 120
ggaaagcata gaccctgcaa atcatccttc cattgtatac agctggtatt tgttggacgc 180
tggtgtatga tactatatat gcgcatcagg tgtttcgcta tccctacttt catattaatc 240
cttgatgaag tggccatttc atgttgtcgc ggtggtctta tacttgcata tctccatgca 300
tctcaggaca aagangatga cctgaaagta ggagtccaag tccacagctt aagatttggg 360




<210> 26
<211> 299
<212> DNA
<213> Zea sp

<220>
<221> misc_feature
<222> (1)...(299)
<223> n = A,T,C or G

<400> 26
gatggttgca gcatctgcaa ataccctcaa ccaggtgttt gngataaaaa atgatgctaa 60
aatgaaaagg acaatgcgtg ccccctgcca tctggtcgca ttagtcctgc acatgctgcg 120
atgtgggcta caagtgttgg agttgcagga acagctttgt tggcctggaa ggctaatggc 180
ttggcagctg ggcttgcagc ttctaatctt gttctgtatg catttgtgta tacgccgttg 240
aagcaaatac accctgttaa tacatgggtt ggggcagtcg ttggtgccat cccaccact 299



<210> 27
<211> 255
<212> DNA
<213> Zea sp

<220>
<221> misc_feature
<222> (1)...(255)
<223> n = A,T,C or G

<400> 27
anacttgcat atctccatgc ntctcaggac aaagangatg acctgaaagt aggtgtcaag 60
tccacagcat taagatttgg agatttgacc nnatactgna tcagtggctt tggcgcggca 120
tgcttcggca gcttagcact cagtggttac aatgctgacc ttggttggtg tttagtgtga 180
tgcttgagcg aagaatggta tngtttttac ttgatattga ctccagacct gaaatcatgt 240
tggacagggt ggccc 255



<210> 28 
<211> 257 
<212> DNA 
<213> Zea sp 



<400> 28 

attgaagggg ataggactct ggggcttcag tcacttcctg ttgcttttgg gatggaaact 60
gcaaaatgga tttgtgttgg agcaattgat atcactcaat tatctgttgc aggttaccta 120
ttgagcaccg gtaagctgta ttatgccctg gtgttgcttg ggctaacaat tcctcaggtg 180
ttctttcagt tccagtactt cctgaaggac cctgtgaagt atgatgtcaa atatcaggca 240
agcgcacaac cattctt 257



<210> 29
<211> 368
<212> DNA
<213> Zea sp

<400> 29
atccagttgc aaataataat ggcgttcttc tctgttgtaa tagcactatt caaggatata 60
cctgacatcg aaggggaccg catattcggg atccgatcct tcagcgtccg gttagggcaa 120
aagaaggtct tttggatctg cgttggcttg cttgagatgg cctacagcgt tgcgatactg 180
atgggagcta cctcttcctg tttgtggagc aaaacagcaa ccatcgctgg ccattccata 240
cttgccgcga tcctatggag ctgcgcgcga tcggtggact tgacgagcaa agccgcaata 300
acgtccttct acatgttcat ctggaagctg ttctacgcgg agtacctgct catccctctg 360
gtgcggtg 368

<210> 30
<211> 122
<212> PRT
<213> Zea sp

<400> 30

Ile Gln Leu Gln Ile Ile Met Ala Phe Phe Ser Val Val Ile Ala Leu
1 5 10 15

Phe Lys Asp Ile Pro Asp Ile Glu Gly Asp Arg Ile Phe Gly Ile Arg
20 25 30

Ser Phe Ser Val Arg Leu Gly Gln Lys Lys Val Phe Trp Ile Cys Val
35 40 45

Gly Leu Leu Glu Met Ala Tyr Ser Val Ala Ile Leu Met Gly Ala Thr
50 55 60

Ser Ser Cys Leu Trp Ser Lys Thr Ala Thr Ile Ala Gly His Ser Ile
65 70 75 80

Leu Ala Ala Ile Leu Trp Ser Cys Ala Arg Ser Val Asp Leu Thr Ser
85 90 95

Lys Ala Ala Ile Thr Ser Phe Tyr Met Phe Ile Trp Lys Leu Phe Tyr
100 105 110

Ala Glu Tyr Leu Leu Ile Pro Leu Val Arg
115 120

<210> 31
<211> 278
<212> DNA
<213> Zea sp

<400> 31
tattcagcac cacctctcaa gctcaagcag aatggatgga ttgggaactt cgctctgggt 60
gcgagttaca tcagcttgcc ctggtgggct ggccaggcgt tatttggaac tcttacacca 120
gatatcattg tcttgactac tttgtacagc atagctgggc tagggattgc tattgtaaat 180
gatttcaaga gtattgaagg ggataggact ctggggcttc agtcacttcc tgttgctttt 240
gggatggaaa ctgcaaaatg gatttgtgtt ggagcaat 278

<210> 32
<211> 292
<212> PRT
<213> Synechocystis sp

<400> 32

Met Val Ala Gln Thr Pro Ser Ser Pro Pro Leu Trp Leu Thr Ile Ile
1 5 10 15

Tyr Leu Leu Arg Trp His Lys Pro Ala Gly Arg Leu Ile Leu Met Ile
20 25 30

Pro Ala Leu Trp Ala Val Cys Leu Ala Ala Gln Gly Leu Pro Pro Leu
35 40 45

Pro Leu Leu Gly Thr Ile Ala Leu Gly Thr Leu Ala Thr Ser Gly Leu
50 55 60

Gly Cys Val Val Asn Asp Leu Trp Asp Arg Asp Ile Asp Pro Gln Val
65 70 75 80

Glu Arg Thr Lys Gln Arg Pro Leu Ala Ala Arg Ala Leu Ser Val Gln
85 90 95

Val Gly Ile Gly Val Ala Leu Val Ala Leu Leu Cys Ala Ala Gly Leu
100 105 110

Ala Phe Tyr Leu Thr Pro Leu Ser Phe Trp Leu Cys Val Ala Ala Val
115 120 125

Pro Val Ile Val Ala Tyr Pro Gly Ala Lys Arg Val Phe Pro Val Pro
130 135 140

Gln Leu Val Leu Ser Ile Ala Trp Gly Phe Ala Val Leu Ile Ser Trp
145 150 155 160

Ser Ala Val Thr Gly Asp Leu Thr Asp Ala Thr Trp Val Leu Trp Gly
165 170 175

Ala Thr Val Phe Trp Thr Leu Gly Phe Asp Thr Val Tyr Ala Met Ala
180 185 190

Asp Arg Glu Asp Asp Arg Arg Ile Gly Val Asn Ser Ser Ala Leu Phe
195 200 205

Phe Gly Gln Tyr Val Gly Glu Ala Val Gly Ile Phe Phe Ala Leu Thr
210 215 220

Ile Gly Cys Leu Phe Tyr Leu Gly Met Ile Leu Met Leu Asn Pro Leu
225 230 235 240

Tyr Trp Leu Ser Leu Ala Ile Ala Ile Val Gly Trp Val Ile Gln Tyr
245 250 255

Ile Gln Leu Ser Ala Pro Thr Pro Glu Pro Lys Leu Tyr Gly Gln Ile
260 265 270

Phe Gly Gln Asn Val Ile Ile Gly Phe Val Leu Leu Ala Gly Met Leu
275 280 285

Leu Gly Trp Leu
290



<210> 33
<211> 316
<212> PRT
<213> Synechocystis sp

<400> 33

Met Val Thr Ser Thr Lys Ile His Arg Gln His Asp Ser Met Gly Ala
1 5 10 15

Val Cys Lys Ser Tyr Tyr Gln Leu Thr Lys Pro Arg Ile Ile Pro Leu
20 25 30

Leu Leu Ile Thr Thr Ala Ala Ser Met Trp Ile Ala Ser Glu Gly Arg
35 40 45

Val Asp Leu Pro Lys Leu Leu Ile Thr Leu Leu Gly Gly Thr Leu Ala
50 55 60

Ala Ala Ser Ala Gln Thr Leu Asn Cys Ile Tyr Asp Gln Asp Ile Asp
65 70 75 80

Tyr Glu Met Leu Arg Thr Arg Ala Arg Pro Ile Pro Ala Gly Lys Val
85 90 95

Gln Pro Arg His Ala Leu Ile Phe Ala Leu Ala Leu Gly Val Leu Ser
100 105 110

Phe Ala Leu Leu Ala Thr Phe Val Asn Val Leu Ser Gly Cys Leu Ala
115 120 125

Leu Ser Gly Ile Val Phe Tyr Met Leu Val Tyr Thr His Trp Leu Lys
130 135 140

Arg His Thr Ala Gln Asn Ile Val Ile Gly Gly Ala Ala Gly Ser Ile
145 150 155 160

Pro Pro Leu Val Gly Trp Ala Ala Val Thr Gly Asp Leu Ser Trp Thr
165 170 175

Pro Trp Val Leu Phe Ala Leu Ile Phe Leu Trp Thr Pro Pro His Phe
180 185 190

Trp Ala Leu Ala Leu Met Ile Lys Asp Asp Tyr Ala Gln Val Asn Val
195 200 205

Pro Met Leu Pro Val Ile Ala Gly Glu Glu Lys Thr Val Ser Gln Ile
210 215 220

Trp Tyr Tyr Ser Leu Leu Val Val Pro Phe Ser Leu Leu Leu Val Tyr
225 230 235 240

Pro Leu His Gln Leu Gly Ile Leu Tyr Leu Ala Ile Ala Ile Ile Leu
245 250 255

Gly Gly Gln Phe Leu Val Lys Ala Trp Gln Leu Lys Gln Ala Pro Gly
260 265 270

Asp Arg Asp Leu Ala Arg Gly Leu Phe Lys Phe Ser Ile Phe Tyr Leu
275 280 285

Met Leu Leu Cys Leu Ala Met Val Ile Asp Ser Leu Pro Val Thr His
290 295 300

Gln Leu Val Ala Gln Met Gly Thr Leu Leu Leu Gly
305 310 315

<210> 34
<211> 324
<212> PRT
<213> Synechocystis sp

<400> 34

Met Ser Asp Thr Gln Asn Thr Gly Gln Asn Gln Ala Lys Ala Arg Gln
1 5 10 15

Leu Leu Gly Met Lys Gly Ala Ala Pro Gly Glu Ser Ser Ile Trp Lys
20 25 30

Ile Arg Leu Gln Leu Met Lys Pro Ile Thr Trp Ile Pro Leu Ile Trp
35 40 45

Gly Val Val Cys Gly Ala Ala Ser Ser Gly Gly Tyr Ile Trp Ser Val
50 55 60

Glu Asp Phe Leu Lys Ala Leu Thr Cys Met Leu Leu Ser Gly Pro Leu
65 70 75 80

Met Thr Gly Tyr Thr Gln Thr Leu Asn Asp Phe Tyr Asp Arg Asp Ile
85 90 95

Asp Ala Ile Asn Glu Pro Tyr Arg Pro Ile Pro Ser Gly Ala Ile Ser
100 105 110

Val Pro Gln Val Val Thr Gln Ile Leu Ile Leu Leu Val Ala Gly Ile
115 120 125

Gly Val Ala Tyr Gly Leu Asp Val Trp Ala Gln His Asp Phe Pro Ile
130 135 140

Met Met Val Leu Thr Leu Gly Gly Ala Phe Val Ala Tyr Ile Tyr Ser
145 150 155 160

Ala Pro Pro Leu Lys Leu Lys Gln Asn Gly Trp Leu Gly Asn Tyr Ala
165 170 175

Leu Gly Ala Ser Tyr Ile Ala Leu Pro Trp Trp Ala Gly His Ala Leu
180 185 190

Phe Gly Thr Leu Asn Pro Thr Ile Met Val Leu Thr Leu Ile Tyr Ser
195 200 205

Leu Ala Gly Leu Gly Ile Ala Val Val Asn Asp Phe Lys Ser Val Glu
210 215 220

Gly Asp Arg Gln Leu Gly Leu Lys Ser Leu Pro Val Met Phe Gly Ile
225 230 235 240

Gly Thr Ala Ala Trp Ile Cys Val Ile Met Ile Asp Val Phe Gln Ala
245 250 255

Gly Ile Ala Gly Tyr Leu Ile Tyr Val His Gln Gln Leu Tyr Ala Thr
260 265 270

Ile Val Leu Leu Leu Leu Ile Pro Gln Ile Thr Phe Gln Asp Met Tyr
275 280 285

Phe Leu Arg Asn Pro Leu Glu Asn Asp Val Lys Tyr Gln Ala Ser Ala
290 295 300

Gln Pro Phe Leu Val Phe Gly Met Leu Ala Thr Gly Leu Ala Leu Gly
305 310 315 320

His Ala Gly Ile



<210> 35
<211> 307
<212> PRT
<213> Synechocystis sp

<400> 35

Met Thr Glu Ser Ser Pro Leu Ala Pro Ser Thr Ala Pro Ala Thr Arg
1 5 10 15

Lys Leu Trp Leu Ala Ala Ile Lys Pro Pro Met Tyr Thr Val Ala Val
20 25 30

Val Pro Ile Thr Val Gly Ser Ala Val Ala Tyr Gly Leu Thr Gly Gln
35 40 45

Trp His Gly Asp Val Phe Thr Ile Phe Leu Leu Ser Ala Ile Ala Ile
50 55 60

Ile Ala Trp Ile Asn Leu Ser Asn Asp Val Phe Asp Ser Asp Thr Gly
65 70 75 80

Ile Asp Val Arg Lys Ala His Ser Val Val Asn Leu Thr Gly Asn Arg
85 90 95

Asn Leu Val Phe Leu Ile Ser Asn Phe Phe Leu Leu Ala Gly Val Leu
100 105 110

Gly Leu Met Ser Met Ser Trp Arg Ala Gln Asp Trp Thr Val Leu Glu
115 120 125

Leu Ile Gly Val Ala Ile Phe Leu Gly Tyr Thr Tyr Gln Gly Pro Pro
130 135 140

Phe Arg Leu Gly Tyr Leu Gly Leu Gly Glu Leu Ile Cys Leu Ile Thr
145 150 155 160

Phe Gly Pro Leu Ala Ile Ala Ala Ala Tyr Tyr Ser Gln Ser Gln Ser
165 170 175

Phe Ser Trp Asn Leu Leu Thr Pro Ser Val Phe Val Gly Ile Ser Thr
180 185 190

Ala Ile Ile Leu Phe Cys Ser His Phe His Gln Val Glu Asp Asp Leu
195 200 205

Ala Ala Gly Lys Lys Ser Pro Ile Val Arg Leu Gly Thr Lys Leu Gly
210 215 220

Ser Gln Val Leu Thr Leu Ser Val Val Ser Leu Tyr Leu Ile Thr Ala
225 230 235 240

Ile Gly Val Leu Cys His Gln Ala Pro Trp Gln Thr Leu Leu Ile Ile
245 250 255

Ala Ser Leu Pro Trp Ala Val Gln Leu Ile Arg His Val Gly Gln Tyr
260 265 270

His Asp Gln Pro Glu Gln Val Ser Asn Cys Lys Phe Ile Ala Val Asn
275 280 285

Leu His Phe Phe Ser Gly Met Leu Met Ala Ala Gly Tyr Gly Trp Ala
290 295 300

Gly Leu Gly
305

<210> 36
<211> 927
<212> DNA
<213> Synechocystis sp

<400> 36
atggcaacta tccaagcttt ttggcgcttc tcccgccccc ataccatcat tggtacaact 60
ctgagcgtct gggctgtgta tctgttaact attctcgggg atggaaactc agttaactcc 120
cctgcttccc tggatttagt gttcggcgct tggctggcct gcctgttggg taatgtgtac 180
attgtcggcc tcaaccaatt gtgggatgtg gacattgacc gcatcaataa gccgaatttg 240
cccctagcta acggagattt ttctatcgcc cagggccgtt ggattgtggg actttgtggc 300
gttgcttcct tggcgatcgc ctggggatta gggctatggc tggggctaac ggtgggcatt 360
agtttgatta ttggcacggc ctattcggtg ccgccagtga ggttaaagcg cttttccctg 420
ctggcggccc tgtgtattct gacggtgcgg ggaattgtgg ttaacttggg cttattttta 480
ttttttagaa ttggtttagg ttatcccccc actttaataa cccccatctg ggttttgact 540
ttatttatct tagttttcac cgtggcgatc gccattttta aagatgtgcc agatatggaa 600
ggcgatcggc aatttaagat tcaaacttta actttgcaaa tcggcaaaca aaacgttttt 660
cggggaacct taattttact cactggttgt tatttagcca tggcaatctg gggcttatgg 720
gcggctatgc ctttaaatac tgctttcttg attgtttccc atttgtgctt attagcctta 780
ctctggtggc ggagtcgaga tgtacactta gaaagcaaaa ccgaaattgc tagtttttat 840
cagtttattt ggaagctatt tttcttagag tacttgctgt atcccttggc tctgtggtta 900
cctaattttt ctaatactat tttttag 927



<210> 37
<211> 308
<212> PRT
<213> Synechocystis sp

<400> 37

Met Ala Thr Ile Gln Ala Phe Trp Arg Phe Ser Arg Pro His Thr Ile
1 5 10 15

Ile Gly Thr Thr Leu Ser Val Trp Ala Val Tyr Leu Leu Thr Ile Leu
20 25 30

Gly Asp Gly Asn Ser Val Asn Ser Pro Ala Ser Leu Asp Leu Val Phe
35 40 45

Gly Ala Trp Leu Ala Cys Leu Leu Gly Asn Val Tyr Ile Val Gly Leu
50 55 60

Asn Gln Leu Trp Asp Val Asp Ile Asp Arg Ile Asn Lys Pro Asn Leu
65 70 75 80

Pro Leu Ala Asn Gly Asp Phe Ser Ile Ala Gln Gly Arg Trp Ile Val
85 90 95

Gly Leu Cys Gly Val Ala Ser Leu Ala Ile Ala Trp Gly Leu Gly Leu
100 105 110

Trp Leu Gly Leu Thr Val Gly Ile Ser Leu Ile Ile Gly Thr Ala Tyr
115 120 125

Ser Val Pro Pro Val Arg Leu Lys Arg Phe Ser Leu Leu Ala Ala Leu
130 135 140

Cys Ile Leu Thr Val Arg Gly Ile Val Val Asn Leu Gly Leu Phe Leu
145 150 155 160

Phe Phe Arg Ile Gly Leu Gly Tyr Pro Pro Thr Leu Ile Thr Pro Ile
165 170 175

Trp Val Leu Thr Leu Phe Ile Leu Val Phe Thr Val Ala Ile Ala Ile
180 185 190

Phe Lys Asp Val Pro Asp Met Glu Gly Asp Arg Gln Phe Lys Ile Gln
195 200 205

Thr Leu Thr Leu Gln Ile Gly Lys Gln Asn Val Phe Arg Gly Thr Leu
210 215 220

Ile Leu Leu Thr Gly Cys Tyr Leu Ala Met Ala Ile Trp Gly Leu Trp
225 230 235 240

Ala Ala Met Pro Leu Asn Thr Ala Phe Leu Ile Val Ser His Leu Cys
245 250 255

Leu Leu Ala Leu Leu Trp Trp Arg Ser Arg Asp Val His Leu Glu Ser
260 265 270

Lys Thr Glu Ile Ala Ser Phe Tyr Gln Phe Ile Trp Lys Leu Phe Phe
275 280 285

Leu Glu Tyr Leu Leu Tyr Pro Leu Ala Leu Trp Leu Pro Asn Phe Ser
290 295 300

Asn Thr Ile Phe
305
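
The relationship between the nucleotide entry SEQ ID NO: 36 and the amino acid entry SEQ ID NO: 37 can be spot-checked by translating the first row of SEQ ID NO: 36 with the standard genetic code. The following is a minimal sketch only: the helper name `translate` and the compact codon table are illustrative assumptions introduced here and are not part of the listing, and the sketch does not handle ambiguous bases such as n.

```python
# Standard genetic code, with codons enumerated in t, c, a, g order.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a DNA string codon by codon, ignoring whitespace and
    stopping at the first stop codon."""
    seq = "".join(dna.lower().split())
    protein = []
    for pos in range(0, len(seq) - 2, 3):
        residue = CODON_TABLE[seq[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# Nucleotides 1-60 of SEQ ID NO: 36 as given in the listing.
seq36_row1 = "atggcaacta tccaagcttt ttggcgcttc tcccgccccc ataccatcat tggtacaact"
# Residues 1-20 of SEQ ID NO: 37, written in one-letter code.
seq37_first20 = "MATIQAFWRFSRPHTIIGTT"

print(translate(seq36_row1) == seq37_first20)
```

The same comparison can be repeated row by row; each 60-nucleotide row of the DNA entry corresponds to 20 residues of the protein entry.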

<210> 38
<211> 1092
<212> DNA
<213> Synechocystis sp

<400> 38
atgaaatttc cgccccacag cggctaccat tggcaaggcc [illegible] [illegible] 60
tggtacgtgc gcctgctttt gccccaaccc ggggaaagct [illegible] [illegible] 120
gaaaaccccg ctagcgatca ccattacggc [illegible] [illegible] [illegible] 180
[illegible] aagaaaatca ggaagaccaa [illegible] ggacattccc [illegible] 240
aaattttggg ccagtcctcg [illegible] ctagggcatt ggggaaaatg tagggataac 300
aggcaggcga aacccctact ctccgaagaa ttttttgcca cggtcaagga [illegible] 360
[illegible] atcagcacca [illegible] attcatggcg atcgtcattg tcgctggcag 420
ttcaccgtag aaccggaagt aacttggggg agtcctaacc gatttcctcg ggctacagcg 480
ggttggcttt [illegible] cttgtttgat cccggttggc [illegible] agcccaaggt 540
agagcgcacg gctggctgaa atggcagagg gaacagtatg [illegible] cgccctagtt 600
tatgccgaaa aaaattgggg tcactccttt ccctcccgct ggttttggct ccaagcaaat 660
tattttcctg accatccagg actgagcgtc actgccgctg gcggggaacg [illegible] 720
ggtcgccccg aagaggtagc tttaattggc ttacatcacc aaggtaattt ttacgaattt 780
ggcccgggcc atggcacagt cacttggcaa gtagctccct ggggccgttg gcaattaaaa 840
gccagcaatg ataggtattg ggtcaagttg tccggaaaaa cagataaaaa aggcagctta 900
gtccacactc ccaccgccca gggcttacaa ctcaactgcc gagataccac taggggctat 960
ttgtatttgc aattgggatc tgtgggtcac ggcctgatag tgcaagggga aacggacacc 1020
gcggggctag aagttggagg tgattggggt ttaacagagg aaaatttgag caaaaaaaca 1080
gtgccattct ga 1092



<210> 39
<211> 363
<212> PRT
<213> Synechocystis sp

<400> 39

Met Lys Phe Pro Pro His Ser Gly Tyr His Trp Gln Gly Gln Ser Pro
1 5 10 15

Phe Phe Glu Gly Trp Tyr Val Arg Leu Leu Leu Pro Gln Ser Gly Glu
20 25 30

Ser Phe Ala Phe Met Tyr Ser Ile Glu Asn Pro Ala Ser Asp His His
35 40 45

Tyr Gly Gly Gly Ala Val Gln Ile Leu Gly Pro Ala Thr Lys Lys Gln
50 55 60

Glu Asn Gln Glu Asp Gln Leu Val Trp Arg Thr Phe Pro Ser Val Lys
65 70 75 80

Lys Phe Trp Ala Ser Pro Arg Gln Phe Ala Leu Gly His Trp Gly Lys
85 90 95

Cys Arg Asp Asn Arg Gln Ala Lys Pro Leu Leu Ser Glu Glu Phe Phe
100 105 110

Ala Thr Val Lys Glu Gly Tyr Gln Ile His Gln Asn Gln His Gln Gly
115 120 125

Gln Ile Ile His Gly Asp Arg His Cys Arg Trp Gln Phe Thr Val Glu
130 135 140

Pro Glu Val Thr Trp Gly Ser Pro Asn Arg Phe Pro Arg Ala Thr Ala
145 150 155 160

Gly Trp Leu Ser Phe Leu Pro Leu Phe Asp Pro Gly Trp Gln Ile Leu
165 170 175

Leu Ala Gln Gly Arg Ala His Gly Trp Leu Lys Trp Gln Arg Glu Gln
180 185 190

Tyr Glu Phe Asp His Ala Leu Val Tyr Ala Glu Lys Asn Trp Gly His
195 200 205

Ser Phe Pro Ser Arg Trp Phe Trp Leu Gln Ala Asn Tyr Phe Pro Asp
210 215 220

His Pro Gly Leu Ser Val Thr Ala Ala Gly Gly Glu Arg Ile Val Leu
225 230 235 240

Gly Arg Pro Glu Glu Val Ala Leu Ile Gly Leu His His Gln Gly Asn
245 250 255

Phe Tyr Glu Phe Gly Pro Gly His Gly Thr Val Thr Trp Gln Val Ala
260 265 270

Pro Trp Gly Arg Trp Gln Leu Lys Ala Ser Asn Asp Arg Tyr Trp Val
275 280 285

Lys Leu Ser Gly Lys Thr Asp Lys Lys Gly Ser Leu Val His Thr Pro
290 295 300

Thr Ala Gln Gly Leu Gln Leu Asn Cys Arg Asp Thr Thr Arg Gly Tyr
305 310 315 320

Leu Tyr Leu Gln Leu Gly Ser Val Gly His Gly Leu Ile Val Gln Gly
325 330 335

Glu Thr Asp Thr Ala Gly Leu Glu Val Gly Gly Asp Trp Gly Leu Thr
340 345 350

Glu Glu Asn Leu Ser Lys Lys Thr Val Pro Phe
355 360

<210> 40
<211> 56
<212> DNA
<213> Artificial Sequence
<400> 40

cgcgatttaa atggcgcgcc ctgcaggcgg ccgcctgcag ggcgcgccat ttaaat 56

<210> 41
<211> 32
<212> DNA
<213> Artificial Sequence






<400> 41

tcgaggatcc gcggccgcaa gcttcctgca gg 32

<210> 42
<211> 32
<212> DNA
<213> Artificial Sequence
<400> 42

tcgacctgca ggaagcttgc ggccgcggat cc 32

<210> 43
<211> 32
<212> DNA
<213> Artificial Sequence
<400> 43

tcgacctgca ggaagcttgc ggccgcggat cc 32

<210> 44
<211> 32
<212> DNA
<213> Artificial Sequence
<400> 44

tcgaggatcc gcggccgcaa gcttcctgca gg 32

<210> 45
<211> 36
<212> DNA
<213> Artificial Sequence
<400> 45

tcgaggatcc gcggccgcaa gcttcctgca ggagct 36

<210> 46
<211> 28
<212> DNA
<213> Artificial Sequence
<400> 46

cctgcaggaa gcttgcggcc gcggatcc 28

<210> 47
<211> 36
<212> DNA
<213> Artificial Sequence
<400> 47

tcgacctgca ggaagcttgc ggccgcggat ccagct 36

<210> 48
<211> 28
<212> DNA
<213> Artificial Sequence
<400> 48

ggatccgcgg ccgcaagctt cctgcagg 28

<210> 49
<211> 39
<212> DNA
<213> Artificial Sequence
<400> 49

gatcacctgc aggaagcttg cggccgcgga tccaatgca 39

<210> 50
<211> 31
<212> DNA
<213> Artificial Sequence
<400> 50

ttggatccgc ggccgcaagc ttcctgcagg t 31

<210> 51
<211> 41
<212> DNA
<213> Artificial Sequence
<400> 51

ggatccgcgg ccgcacaatg gagtctctgc tctctagttc t 41

<210> 52
<211> 38
<212> DNA
<213> Artificial Sequence
<400> 52

ggatcctgca ggtcacttca aaaaaggtaa cagcaagt 38

<210> 53
<211> 45
<212> DNA
<213> Artificial Sequence
<400> 53

ggatccgcgg ccgcacaatg gcgttttttg ggctctcccg tgttt 45

<210> 54
<211> 40
<212> DNA
<213> Artificial Sequence
<400> 54

ggatcctgca ggttattgaa aacttcttcc aagtacaact 40

<210> 55
<211> 38
<212> DNA
<213> Artificial Sequence
<400> 55

ggatccgcgg ccgcacaatg tggcgaagat ctgttgtt 38

<210> 56
<211> 37
<212> DNA
<213> Artificial Sequence
<400> 56

ggatcctgca ggtcatggag agtagaagga aggagct 37

<210> 57
<211> 50
<212> DNA
<213> Artificial Sequence
<400> 57

ggatccgcgg ccgcacaatg gtacttgccg aggttccaaa gcttgcctct 50

<210> 58
<211> 38
<212> DNA
<213> Artificial Sequence
<400> 58

ggatcctgca ggtcacttgt ttctggtgat gactctat 38

<210> 59
<211> 38
<212> DNA
<213> Artificial Sequence
<400> 59

ggatccgcgg ccgcacaatg acttcgattc tcaacact 38

<210> 60
<211> 36
<212> DNA
<213> Artificial Sequence
<400> 60

ggatcctgca ggtcagtgtt gcgatgctaa tgccgt 36

<210> 61
<211> 22
<212> DNA
<213> Artificial Sequence
<400> 61

taatgtgtac attgtcggcc tc 22

<210> 62
<211> 60
<212> DNA
<213> Artificial Sequence
<400> 62

gcaatgtaac atcagagatt ttgagacaca acgtggcttt ccacaattcc ccgcaccgtc 60

<210> 63
<211> 22
<212> DNA
<213> Artificial Sequence
<400> 63

aggctaataa gcacaaatgg ga 22

<210> 64
<211> 63
<212> DNA
<213> Artificial Sequence
<400> 64

ggtatgagtc agcaacacct tcttcacgag gcagacctca gcggaattgg tttaggttat 60
ccc 63

<210> 65
<211> 26
<212> DNA
<213> Artificial Sequence
<400> 65

ggatccatgg ttgcccaaac cccatc 26

<210> 66
<211> 61
<212> DNA
<213> Artificial Sequence
<400> 66

gcaatgtaac atcagagatt ttgagacaca acgtggcttt gggtaagcaa caatgaccgg 60
61

<210> 67
<211> 25
<212> DNA
<213> Artificial Sequence
<400> 67

gaattctcaa agccagccca gtaac 25

<210> 68
<211> 63
<212> DNA
<213> Artificial Sequence
<400> 68

ggtatgagtc agcaacacct tcttcacgag gcagacctca gcgggtgcga aaagggtttt 60
ccc 63

<210> 69
<211> 23
<212> DNA
<213> Artificial Sequence
<400> 69

ccagtggttt aggctgtgtg gtc 23

<210> 70
<211> 21
<212> DNA
<213> Artificial Sequence
<400> 70

ctgagttgga tgtattggat c 21

<210> 71
<211> 28
<212> DNA
<213> Artificial Sequence
<400> 71

ggatccatgg ttacttcgac aaaaatcc 28

<210> 72
<211> 60
<212> DNA
<213> Artificial Sequence
<400> 72

gcaatgtaac atcagagatt ttgagacaca acgtggcttt gctaggcaac cgcttagtac 60

<210> 73
<211> 28
<212> DNA
<213> Artificial Sequence
<400> 73

gaattcttaa cccaacagta aagttccc 28

<210> 74
<211> 63
<212> DNA
<213> Artificial Sequence
<400> 74

ggtatgagtc agcaacacct tcttcacgag gcagacctca gcgccggcat tgtcttttac 60
atg 63

<210> 75
<211> 20
<212> DNA
<213> Artificial Sequence
<400> 75

ggaacccttg cagccgcttc 20

<210> 76
<211> 22
<212> DNA
<213> Artificial Sequence
<400> 76

gtatgcccaa ctggtgcaga gg 22

<210> 77
<211> 28
<212> DNA
<213> Artificial Sequence
<400> 77

ggatccatgt ctgacacaca aaataccg 28

<210> 78
<211> 62
<212> DNA
<213> Artificial Sequence
<400> 78

gcaatgtaac atcagagatt ttgagacaca acgtggcttt cgccaatacc agccaccaac 60
ag 62

<210> 79
<211> 27
<212> DNA
<213> Artificial Sequence
<400> 79

gaattctcaa atccccgcat ggcctag 27

<210> 80
<211> 65
<212> DNA
<213> Artificial Sequence
<400> 80

ggtatgagtc agcaacacct tcttcacgag gcagacctca gcggcctacg gcttggacgt 60
gtggg 65

<210> 81
<211> 21
<212> DNA
<213> Artificial Sequence
<400> 81

cacttggatt cccctgatct g 21

<210> 82
<211> 21
<212> DNA
<213> Artificial Sequence
<400> 82

gcaatacccg cttggaaaac g 21

<210> 83
<211> 29
<212> DNA
<213> Artificial Sequence
<400> 83

ggatccatga ccgaatcttc gcccctagc 29

<210> 84
<211> 61
<212> DNA
<213> Artificial Sequence
<400> 84

gcaatgtaac atcagagatt ttgagacaca acgtggcttt caatcctagg tagccgaggc 60

<210> 85
<211> 27
<212> DNA
<213> Artificial Sequence
<400> 85

gaattcttag cccaggccag cccagcc 27

<210> 86
<211> 66
<212> DNA
<213> Artificial Sequence
<400> 86

ggtatgagtc agcaacacct tcttcacgag gcagacctca gcggggaatt gatttgttta 60
attacc 66

<210> 87
<211> 21
<212> DNA
<213> Artificial Sequence
<400> 87

gcgatcgcca ttatcgcttg g 21

<210> 88
<211> 24
<212> DNA
<213> Artificial Sequence
<400> 88

gcagactggc aattatcagt aacg 24

<210> 89
<211> 25
<212> DNA
<213> Artificial Sequence
<400> 89

ccatggattc gagtaaagtt gtcgc 25

<210> 90
<211> 25
<212> DNA
<213> Artificial Sequence
<400> 90

gaattcactt caaaaaaggt aacag 25


<210> 91
<211> 4550
<212> DNA
<213> Arabidopsis sp

<400> 91

attttacacc aatttgatca cttaactaaa ttaattaaat tagatgatta tcccaccata 60
tttttgagca ttaaaccata aaaccatagt tataagtaac tgttttaatc gaatatgact 120
cgattaagat taggaaaaat ttataaccgg taattaagaa aacattaacc gtagtaaccg 180
taaatgccga ttcctccctt gtctaaaaga cagaaaacat atattttatt ttgccccata 240
tgtttcactc tatttaattt caggcacaat acttttggtt ggtaacaaaa ctaaaaagga 300
caacacgtga tacttttcct cgtccgtcag tcagattttt tttaaactag aaacaagtgg 360
caaatctaca ccacattttt tgcttaatct attaacttgt aagttttaaa ttcctaaaaa 420
agtctaacta attcttctaa tataagtaca ttccctaaat ttcccaaaaa gtcaaattaa 480
taattttcaa aatctaatct aaatatctaa taattcaaaa tcattaaaaa gacacgcaac 540
aatgacacca attaatcatc ctcgacccac acaattctac agttctcatg ctaaaccata 600
ttttttgctc tctgttcctt caaaatcatt tctttctctt ctttgattcc caaagatcac 660
ttctttgtct ttgatttttg attttttttc tctctggcgt gaaggaagaa gctttatttc 720
atggagtctc tgctctctag ttcttctctt gtttccgctg gtaaatctcg tccttttctg 780
gtttcaggtt ttatttgttg tttaggtttc gtttttgtga ttcagaacca tacaaaaagt 840
ttgaactttt ctgaatataa aataaggaaa aagtttcgat ttttataatg aattgtttac 900
tagatcgaag taggtgacaa aggttattgt gtggagaagc ataatttctg ggcttgactt 960
tgaattttgt ttctcatgca tgcaacttat caatcagctg gtgggttttg ttggaagaag 1020
cagaatctaa agctccactc tttatcaggt tcgttagggt tttatgggtt tttgaaatta 1080
aatactcaat catcttagtc tcattattct attggttgaa tcacattttc taatttggaa 1140
tttatgagac aatgtatgtt ggacttagtt gaagttcttc tctttggtta tagttgaagt 1200
gttactgatg ttgtttagct ctttacacca atatatacac ccaattttgc agaaatccga 1260
gttctgcgtt gtgattcgag taaagttgtc gcaaaaccga agtttaggaa caatcttgtt 1320
aggcctgatg gtcaaggatc ttcattgttg ttgtatccaa aacataagtc gagatttcgg 1380
gttaatgcca ctgcgggtca gcctgaggct ttcgactcga atagcaaaca gaagtctttt 1440
agagactcgt tagatgcgtt ttacaggttt tctaggcctc atacagttat tggcacagtt 1500
aagtttctct ttaaaaatgt aactctttta aaacgcaatc tttcagggtt ttcaaggaga 1560
taacattagc tctgtgattg gatttgcagg tgcttagcat tttatctgta tctttcttag 1620
cagtagagaa ggtttctgat atatctcctt tacttttcac tggcatcttg gaggtaatga 1680
atatataaca cataatgacc gatgaagaag atacattttt ttcgtctctc tgtttaaaca 1740
attgggtttt gttttcaggc tgttgttgca gctctcatga tgaacattta catagttggg 1800
ctaaatcagt tgtctgatgt tgaaatagat aaggtaacat gcaaattttc ttcatatgag 1860
ttcgagagac tgatgagatt aatagcagct agtgcctaga tcatctctat gtgggttttt 1920
gcaggttaac aagccctatc ttccattggc atcaggagaa tattctgtta acaccggcat 1980
tgcaatagta gcttccttct ccatcatggt atggtgccat tttcacaaaa tttcaacttt 2040
tagaattcta taagttactg aaatagtttg ttataaatcg ttatagagtt tctggcttgg 2100
gtggattgtt ggttcatggc cattgttctg ggctcttttt gtgagtttca tgctcggtac 2160
tgcatactct atcaatgtaa gtaagtttct caatactaga atttggctca aatcaaaatc 2220
tgcagtttct agttttaggt taatgaggtt ttaataactt acttctacta caaacagttg 2280
ccacttttac ggtggaaaag atttgcattg gttgcagcaa tgtgtatcct cgctgtccga 2340
gctattattg ttcaaatcgc cttttatcta catattcagg tactaaacca ttttccttat 2400
gttttgtagt tgttttcatc

tgtggctcaa cctccggtgg cttatgcctc tgctgcaccg tttcctttcc 2400
tcccagctcc ttccttctac tctccatgat aacctttaag caagctattg aatttttgga 2460
aacagaaatt aaaaaaaaaa tctgaaaagt tcttaagttt aatctttggt taataatgaa 2520
gtggagaacg catacaagtt tatgtatttt ttctcatctc cacataattg tattttttct 2580
ctaagtatgt ttcaaatgat acaaaataca tactttatca attatctgat caaattgatg 2640
aatttttgag ctttgacgtg ttaggtctat ctaataaacg tagtaacgaa tttggttttg 2700
gaaatgaaat ccgataaccg atgatggtgt agagttaaac gattaaaccg ggttggttaa 2760
aggtctcgag tctcgacggc tgcggaaatc ggaaaatcac gattgaggac tttgagctgc 2820
cacgaagatg gcgatgaggt tgaaatcaat 2850


<210> 94
<211> 3660
<212> DNA
<213> Arabidopsis sp

<400> 94

tatttgtatt tttattgtta aattttatga tttcacccgg tatatatcat cccatattaa 60
tattagattt attttttggg ctttatttgg gttttcgatt taaactgggc ccattctgct 120
tcaatgaaac cctaatgggt tttgtttggg ctttggattt aaaccgggcc cattctgctt 180
caatgaaggt cctttgtcca acaaaactaa catccgacac aactagtatt gccaagagga 240
tcgtgccaca tggcagttat tgaatcaaag gccgccaaaa ctgtaacgta gacattactt 300
atctccggta acggacaacc actcgtttcc cgaaacagca actcacagac tcacaccact 360
ccagtctccg gcttaactac caccagagac gattctctct tccgtcggtt ctatgacttc 420
gattctcaac actgtctcca ccatccactc ttccagagtt acctccgtcg atcgagtcgg 480
agtcctctct cttcggaatt cggattccgt tgagttcact cgccggcgtt ctggtttctc 540
gacgttgatc tacgaatcac ccggtagtta gcattctgtt ggatagattg atgaatgttt 600
tcttcgattt tttttttact gatcttgttg tggatctctc gtagggcgga gatttgttgt 660
gcgtgcggcg gagactgata ctgataaagg tatgattttt tagttgtttt tattttctct 720
ctcttcaaaa ttctcttttc aaacactgtg gcgtttgaat ttccgacggc agttaaatct 780
cagacacctg acaaggcacc agccggtggt tcaagcatta accagcttct cggtatcaaa 840
ggagcatctc aagaaactgt aattttgttc atctcctcag aatcttttaa attatcatat 900
ttgtggataa tgatgtgtta gtttaggaat tttcctacta aaggtaatct cttttgagga 960
caagtcttgt ttttagctta gaaatgatgt gaaaatgttg tttgttagct aaaaagagtt 1020
tgttgttata ttctgtattc agaataaatg gaagattcgt cttcagctta caaaaccagt 1080
cacttggcct ccactggttt ggggagtcgt ctgtggtgct gctgcttcag gtaatcatac 1140
gaacctcttt tggatcatgc aatactgtac agaaagtttt ttcattttcc ttccaattgt 1200
ttcttctggc agggaacttt cattggaccc cagaggatgt tgctaagtcg attctttgca 1260
tgatgatgtc tggtccttgt cttactggct atacacaggt ctggttttac acaacaaaaa 1320
gctgacttgt tcttattcta gtgcatttgc ttggtgctac aataacctag acttgtcgat 1380
ttccagacaa tcaacgactg gtatgataga gatatcgacg caattaatga gccatatcgt 1440
ccaattccat ctggagcaat atcagagcca gaggtaactg agacagaaca ttgtgagctt 1500
ttatctcttt tgtgattctg atttctcctt actccttaaa atgcaggtta ttacacaagt 1560
ctgggtgcta ttattgggag gtcttggtat tgctggaata ttagatgtgt gggtaagttg 1620
gcccttctga cattaactag tacagttaaa gggcacatca gatttgctaa aatcttccct 1680
tatcaggcag ggcataccac tcccactgtc ttctatcttg ctttgggagg atcattgcta 1740
tcttatatat actctgctcc acctcttaag gtaagtttta ttcctaactt ccactctcta 1800
gtgataagac actccatcca agttttggag ttttgaatat cgatatctga actgatctca 1860
ttgcagctaa aacaaaatgg atgggttgga aattttgcac ttggagcaag ctatattagt 1920
ttgccatggt aagatatctc gtgtatcaat aatatatggc gttgttctca tctcattgat 1980
ttgtttcttg ctcacttgac tgataggtgg gctggccaag cattgtttgg cactcttacg 2040
ccagatgttg ttgttctaac actcttgtac agcatagctg gggtactctt ttggcaaacc 2100
ttttatgttg cttttttcgt tatctgttgt aatatgctct tgcttcatgt tgtacctttg 2160
tgataatgca gttaggaata gccattgtta acgacttcaa aagtgttgaa ggagatagag 2220
cattaggact tcagtctctc ccagtagctt ttggcaccga aactgcaaaa tggatatgcg 2280
ttggtgctat agacattact cagctttctg ttgccggtat gtactatcca ctgtttttgt 2340
gcagctgtgg cttctatttc ttttccttga tcttatcaac tggatattca ccaatggtaa 2400
agcacaaatt aatgaagctg aatcaacaaa ggcaaaacat aaaagtacat tctaatgaaa 2460
tgagctaatg aagaggaggc atctactttt atgtttcatt agtgtgattg atggattttc 2520
atttcatgct tctaaaacaa gtattttcaa cagtgtcatg aaataacaga acttatatct 2580
tcatttgtac ttttactagt ggatgagtta cacaatcatt gttatagaac caaatcaaag 2640
gtagagatca tcattagtat atgtctattt tggttgcagg atatctatta gcatctggga 2700
aaccttatta tgcgttggcg ttggttgctt tgatcattcc tcagattgtg ttccaggtaa 2760
agacgttaac agtctcacat tataattaat caaattcttg tcactcgtct gattgctaca 2820
ctcgcttcta taaactgcag tttaaatact ttctcaagga ccctgtcaaa tacgacgtca 2880
agtaccaggt aagtcaactt agtacacatg tttgtgttct tttgaaatat ctttgagagg 2940
tctcttaatc agaagttgct tgaaacactc atcttgatta caggcaagcg cgcagccatt 3000
cttggtgctc ggaatatttg taacggcatt agcatcgcaa cactgaaaaa ggcgtatttt 3060
gatggggttt tgtcgaaagc agaggtgttg acacatcaaa tgtgggcaag tgatggcatc 3120
aactagttta aaagattttg taaaatgtat gtaccgttat tactagaaac aactcctgtt 3180
gtatcaattt agcaaaacgg ctgagaaatt gtaattgatg ttaccgtatt tgcgctccat 3240
ttttgcattt cctgctcata tcgaggattg gggtttatgt tagttctgtc acttctctgc 3300
tttcagaatg tttttgtttt ctgtagtgga ttttaactat tttcatcact ttttgtattg 3360
attctaaaca tgtatccaca taaaaacagt aatatacaaa aatgatactt cctcaaactt 3420
tttataatct aaatctaaca actagctagt aacccaacta acttcataca attaatttga 3480
gaaactacaa agactagact atacatatgt tatttaacaa cttgaaactg tgttattact 3540
acctgatttt tttctattct acagccattt gatatgctgc aatcttaaca tatcaagtct 3600
cacgttgttg gacacaacat actatcacaa gtaagacacg aagtaaaacc aaccggcaac 3660
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